CHAPTER I


INTRODUCTION 
(the nature of mental measurement) 

With numbers all men may contend, their charming systems to defend.

GOETHE 

We may look at the science of statistics from two angles.
Firstly, it may be regarded as the process of collecting figures
which represent such things as amounts of exports, price levels,
temperatures and barometric pressures from day to day, examination
marks and so on, for which some scale of measurement has been found
in a world which becomes progressively more metrical. Secondly,
statistics is the study of the means of manipulating and arranging
figures, applying mathematical processes and thereafter interpreting
the results.

Scientific workers try to use the most effective language for
their particular purposes. Clear verbal description is a necessity
of course, but the precise language of mathematics is also necessary
both to describe and to manipulate the results of observations.
Scientists usually feel that they are on firm ground when they can
provide a ‘measuring stick’ in order that they can give quantitative
results at the end of their experiments and observations. It must
be remembered that these results are completely dependent not
only on the accuracy of the observations, but also on the size and
accuracy of the ‘measuring stick’. There is nothing absolute about
their findings; they are merely a matter of comparison with an
agreed unit of a scale, which in itself is an arbitrary measurement
accepted by a large number of workers as a convenient common
standard. In the sciences where we begin with measurements of
length, which lead to those of area, volume and mass, and
measurement of time, there are considerable difficulties in fixing
standards. (We assume, for instance, that time has certain
properties of length and direction, and may be thought to have
some of the properties of a straight line. Great and bewildering
new discoveries were made by Einstein and others in the field
of physics, when some of the elementary foregone conclusions
concerning measurements of length and time were challenged.)

In the study of the ‘properties’ of the human mind, the problem
is much more difficult. The mind is not a thing to be measured
and weighed as can the whole physical human body, or even the
brain. When we talk about the factors of the mind, the abilities
or intelligence of man, we have to be careful to avoid the pitfall of
thinking of these as so many tangible quantities each capable
of measurement in terms of length, volume or force and so on. It
is only fairly recently that the ‘faculty’ psychology (which was
kept alive by educationists long after its natural term of years)
has been properly buried. The mind must not be thought of in
terms of a series of faculties, such as intelligence, memory or will,
and it would be unfortunate if we were to bury ‘faculties’ and
resurrect ‘factors’ in their place.

The study of arithmetic should always be sustained by logical
thought, but many people tend to accept figures and numbers
uncritically. It has been said cynically that statistics are the worst
form of falsehood. This ought not to be correct, but the position
may always be safeguarded by a critical examination of the things
or ideas which underlie them. A simple example of this should
suffice. Some years ago some statistics were used in an unscrupulous
endeavour to show that insulin therapy was useless in cases
of diabetes. It appeared that more people had died each year
from this disease since the introduction of insulin than before it
had been discovered. Moreover, the figures were correct as
they stood! A little thought will show that the figures had been
used to sustain a false argument. Diagnosis of the complaint had
improved and thus diabetes had later been given as a cause of
death, whereas before, the condition was put down to heart failure,
pneumonia or internal inflammation. Interesting as a cause
of death may be, from the statistical point of view what really
matters is whether insulin has extended life, perhaps to a
fairly advanced age, even though death eventually takes place (as
it must for everybody from one cause or another). The conclusion
is that insulin is useful.

Investigations in the physical sciences are on the whole easier
than those on the measurement of human and social factors. In
the physical sciences we are usually able to isolate the property
which we wish to measure and to insulate it, so to speak, from
disturbing external influences. Different physical properties do not
usually cause mutual perturbations which worry the physicist. In
any case he can allow for them accurately. He is not usually
worried about the barometric pressure of the room, the colour,
the magnetic and electrical properties of a piece of metal when he
is measuring its specific heat. Moreover, he is able to use units
which can be measured in a linear way and about which there is
universal agreement.

The matter is not so simple for the psychologist, educationist
and even the biologist, for they find it difficult, or even impossible,
to proceed from cause to effect.* The quantities which we think
we have isolated and measured today have changed by the
morrow. When we believe that we have isolated a physical system
in the living body or a ‘factor’ in the mind, the integration of
function and the working unity of the whole have to be taken into
account even when we hope that we are studying some specific
small ‘part’. The twofold aspects of mental activity, the cognitive
or intellectual and the orectic or striving and emotional, have to
be thought of as being distinct when we try to measure various
manifestations of either of them. It does not need much experience
and thought to see that there are enormous difficulties in isolating
their factors. It is one of the triumphs of modern experimental
psychology and statistical analysis, that in a large measure we
have been able to clear away misconceptions concerning the
so-called ‘factors of the mind’ and to substitute ideas which are
based on scientific procedure. Although we cannot always resist
the temptation to name certain well-marked aspects of mental
activity, we must avoid the temptation to think of these aspects as
concrete quantities, even if we discover a scale by which they can
be estimated on a quantitative basis. We shall meet this exceedingly
important consideration again.

* In the last analysis the physicist does too.

All problems which try to provide information concerning the
world external to the investigator can be thought of in three stages:

(1) The collection of data, taking care that we have the proper
‘measuring rod’ for the job in hand and that we know how to
use it.

(2) By mathematical processes, the manipulation of the figures
of the data, and eventually the arrival at a numerical result.

(3) The interpretation of the result in relation to the original
data. We apply the result to give us further information or to
predict possible future happenings.

At length we may go from generalizations to tentative ‘laws’.
Unfortunately, the second step is the only one which has been
stressed in schools in the past. Really, it is but a link in a more
important and lengthy chain of reasoning.

To make this matter clear let us take as an example a problem
from psychological research. Suppose we wish to find whether
there is any general measure of agreement (correlation) between
ability in classical studies and general intelligence. In the first
stage of our investigation we have to evolve a suitable examination
in classics for each age group, which will ensure that everyone
has a fair chance and that there are sufficient questions and
examinees to avoid errors of sampling. The examination paper
should be suitable for ready marking on a scale which is in
keeping with certain statistical requirements. The measurement
of intelligence is not such an easy matter. Nevertheless, without
enlarging on the considerable difficulties which beset a task which
many people imagine to be relatively simple, we will assume that a
set of marks in classics and a score in an intelligence test given to
the same large number of pupils have been obtained.

The second stage is the mathematical process whereby a
coefficient of correlation between the marks in classics and the
scores in intelligence tests is obtained.

The last stage is to ask whether this coefficient is significant,
how many times larger is it than the probable error, what is the
meaning and value of this correlation, what relationship has it to
other possible correlations, and to what conclusions and further
investigations of educational significance, if any, will it lead.
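
The second and third stages of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lists of marks are invented, the function names are hypothetical, the coefficient is the usual product-moment coefficient, and the probable error is taken from the classical large-sample formula 0.6745(1 - r²)/√N, which is not given in the text above.

```python
import math

def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two sets of marks."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def probable_error_of_r(r, n):
    """Classical probable error of r (an assumption, not taken from the text)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)

# Invented marks for eight pupils, for illustration only.
classics = [52, 61, 45, 70, 38, 66, 58, 49]
intelligence = [101, 110, 96, 121, 90, 108, 112, 99]

r = pearson_r(classics, intelligence)
pe = probable_error_of_r(r, len(classics))
print(f"r = {r:.2f}, probable error = {pe:.2f}, ratio = {r / pe:.1f}")
```

The ratio printed at the end answers the question 'how many times larger is it than the probable error'; what size of ratio should count as significant is a matter of convention and is left to the reader.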

Although we have used the term ‘yardstick’ loosely in dealing
with mental characteristics, it must be noted that there is a great
difference between mental measurements and those of tangible
and physical quantities. For instance, a length of seven feet is
equivalent to the sum of the lengths of seven separate feet, but a
similar consideration does not apply to the type of numerical
abstraction which is obtained in the measurement of human
abilities or sensory discrimination. Mental measurements have
to be made by indirect means and are further complicated by the
fact that the very things which are measured are ill-defined and
that psychologists may even differ as regards the definitions of the
factors which it is proposed to measure. The measurement of
so-called ‘general intelligence’ is a case in point. All psychological
measurement involves sampling and it is necessary to take steps
to ensure that the sample is fully representative of the group, and
secondly that it is large enough to reduce errors of sampling to
small proportions. Moreover, it is necessary to know what are
the possible errors which may mar an estimate made with samples
of particular sizes. In addition to errors which are due to sampling
there are other difficulties. We must know the degree of validity
of a test as a measure of a particular characteristic. It has been
claimed that tests have been evolved which are a ‘measure of
pure g’ (Spearman’s general factor). On investigation, it is found
that such tests are ‘loaded’ (or saturated) with g to little more
than 70% of their whole variance. Again, a test should have
self-consistency or reliability. If it is divided into two parts by taking
the odd and even numbered questions separately, there should
be a high degree of agreement between the results scored in each
half of the test. Although consistency in a test is essential to its
validity, it is not, of course, sufficient to determine the latter. We
shall deal with these matters in a later chapter.

Finally, in educational measurement there is always the
possibility of irrelevant factors disturbing the estimation of particular
characteristics. Hitherto, most mental measurements have dealt
with the cognitive or intellective factors of mental activity, and
it is difficult to separate these from conative or emotional disturbing
elements. Finally, even the simplest individual is a rich and
complex integration of mind and body which is fluctuating from
day to day, or even from moment to moment. The physicist's
brass weight is not sensibly different today from what it was
yesterday, but the human body-mind can never be the same, and
it may have changed considerably.



CHAPTER II 


DISTRIBUTIONS AND DISPERSIONS 
OF SCORES 


If we measure the heights of a large number of boys of the
same age, we find that they are distributed in a definite way.
We can imagine the boys lined up against a long wall starting
with the smallest boy and making each successive boy slightly
taller than the last, in going from left to right. The line joining
the tops of their heads will be a curve with a shape which would
be an elongation of the following:


[Figure: ogive. Horizontal axis: cumulative number of individuals.]

Fig. 1. An ogive or cumulative frequency curve. The curve can also be drawn
with the number of cases given vertically (ordinates) and the marks or other
measures given horizontally (abscissae).



It is known as an Ogive (because of a similar curve which
appeared in classical architecture). We could obtain the same
curve by picking a thousand ears of wheat from a field (or a large
number of peapods of the same crop) and arranging each of them
vertically in a horizontal row, starting with the smallest and
finishing with the longest. In biology and psychology we can think
of many measurements of a similar kind made on a large number
of things of the same type, which would give an ogive if plotted
in this way. We shall meet this curve again when we are dealing
with percentiles. It is sometimes known as a cumulative frequency
curve. It is often more useful to find the frequency or the number
of cases occurring in each range whether of height, weight, marks,
Intelligence Quotient, etc. An easy way is to plot a Histogram.
Consider the following distribution of marks in which each step
is one of 10 marks.


Marks      No. of Pupils

 0- 10           3
10- 20          12
20- 30          21
30- 40          28
40- 50          35
50- 60          37
60- 70          29
70- 80          17
80- 90          10
90-100           5


The height (and therefore the area) of each column gives a
measure of the number of pupils whose marks lie between the
figures at the foot of the column. The whole area of the rectangular
columns gives the total number of pupils. Here a word of
warning is necessary, and it is wise to keep in mind the scales
which are used for the marks along the horizontal axis and for the
frequencies which are vertical measurements. The value of a unit
area on the graph will serve as a guide. The histogram is sometimes
spoken of as a Column Diagram.
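
The relation between the frequencies, the area of the histogram and the ogive can be checked with a short Python sketch. It is only an illustration: the variable names are hypothetical, and the entry for the 70-80 class is read as 17, which is an assumption where the printed table is unclear.

```python
from itertools import accumulate

# Lower limit of each class of marks; every class is 10 marks wide.
lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
class_width = 10
# Number of pupils in each class (the 70-80 entry, 17, is an assumed reading).
pupils = [3, 12, 21, 28, 35, 37, 29, 17, 10, 5]

total_pupils = sum(pupils)

# Area of the histogram: each column has width 10 marks and height f pupils.
histogram_area = sum(f * class_width for f in pupils)

# One unit of area on the graph stands for this many pupils.
pupils_per_unit_area = total_pupils / histogram_area

# Running totals: these are the heights of the cumulative frequency curve.
cumulative = list(accumulate(pupils))

for low, f, c in zip(lower_limits, pupils, cumulative):
    print(f"{low:3d}-{low + class_width:3d}  f = {f:2d}  cumulative = {c:3d}")
print("total pupils:", total_pupils)
```

The last cumulative figure must equal the total number of pupils, which gives a ready check on the addition.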



Fig. 2. Histogram.

Suppose we now consider the mid-points of the top of each
column to be joined by straight lines and completed at each end
by further straight lines joined to the horizontal line as shown in
the diagram. We then have a Frequency Polygon. The frequency
polygon does not give quite the exact picture of the data
which is yielded by the histogram, especially when the number
of cases is small, but frequency polygons may be superimposed
and compared and this is a useful property.



It will readily be appreciated that if we take a large number of
cases which show distribution in a regular manner, the frequency
polygon will take such a shape that it suggests a ‘smoothness’
which would tend to a curve if the intervals of marks became
smaller as the numbers of cases became larger.

We now come to a most important case of frequency distribution.
This is represented by the curve of normal distribution or what
was formerly called the curve of error or the probability curve.

Suppose we measure the heights of 10,000 adult Englishmen
and plot a histogram showing the number in each half-inch range
from (say) 58 inches to 77 inches. (It is possible that we may even
have to extend the range to include men smaller than 4 feet
10 inches, and those taller than 6 feet 5 inches.) If we now
join the mid-points of the tops of the columns and then smooth
the frequency polygon to make a curve we should get a shape
like the following:



This distribution is of the utmost importance in science. We shall
refer to it as the curve of normal distribution. It used to be
called the curve of error because it showed astronomers the
distributions of the errors in their readings about the correct
value, or again, in gunnery it gave the frequencies of the missiles
in respect of their distances from the target after the range had
been found.* The curve is also known as the probability curve
for reasons which will be apparent in a later section of this book.
If a curve is not symmetrical about a line drawn through its
highest point it is said to be Skewed and is known as a Skew
Curve.


* For the properties of this important curve see Chapter V and the appendix.

Below is a positively skewed curve and the greatest frequency
occurs before we come to the middle ‘score’;



and this is a negatively skewed curve and the greatest frequency
occurs after the middle score.



We shall see how the skewing of curves of examination marks and
test scores affects the value of the investigation, when we come to
apply these matters to the problems of marking.

A curve like the following is known as a bimodal curve because it
contains two humps, modes or most ‘popular’ scores. We might
obtain such a curve if we gave an intelligence test to a large
number of children who consisted of two groups whose abilities
were sharply divided.



It will be observed that the curve of normal distribution is
symmetrical about a vertical line drawn through its highest point.
If instead of the heights of a large number of Englishmen, the
curve were made to represent the scores of a large number of
children in an examination, this line would be a measure of the
maximum number of children in any of the mark groups. In the
case of the symmetrical curve we see that (a) the mark which was
scored by the greatest number of children was the average mark
of 50%, (b) the middle child in an order of merit list scored the
average mark. This is obvious as the area enclosed by the curve
to the left of the central straight line is equal to the area enclosed
by the curve to the right of this line.

It will be noticed that in this and other curves there is a central
tendency. The average value (score, mark, height, etc.) is called
the Mean. The value of the middle case (e.g. the mark of the
pupil who is half way down an order-of-merit list or rank) is
called the Median. The score, mark, height, etc. which relates
to the largest number of individuals is called the Mode. (This is
also the meaning of the word in ordinary life.)

Example: The following is a list of marks obtained by school
children in a geography test. Find the mean or average.

Pupil      %

A         45
B         70
C         21
D         32
E         51
F         68
G         48
H         39
I         17
J         84
K         64
L         60
M         44
N         92
O         15
P         31
          ---
16        781

781 ÷ 16 = 48.8

Average 48.8%

Add each column down and check by adding up: tick the
column total when agreement is reached.

If the marks are represented by x, the Mean is

$$M = \frac{\sum x}{N}$$

where $\sum$ (sigma) is the sum of (the scores) and N is the
number of pupils.

An easier way of calculating an average (especially where there
is no great spread of the measures) is to guess the mean and
then adjust it by summing the differences of each measure from
this mean and dividing by the number of measures, e.g.
Find the mean of the following marks:

                  Guessed       Difference
Pupil    Marks    Average       +       -

A         61        50         11
B         40        50                 10
C         52        50          2
D         37        50                 13
E         71        50         21
F         47        50                  3
G         54        50          4
H         32        50                 18
I         73        50         23
J         45        50                  5
K         64        50         14
L         38        50                 12
M         41        50                  9
N         50        50
O         46        50                  4
P         53        50          3
                               ---    ---
                               78     74

16 pupils. Net difference = 78 - 74 = +4

Mean = 50 + 4/16 = 50.25

This method may be expressed as follows:

$$M = A + \frac{\sum D}{N}$$

where A is the guessed or arbitrary mean and $\sum D$ is the sum
of the differences (deviations) of each measure from this mean.
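
The guessed-mean method can be checked against the direct method with a short Python sketch. The function name is hypothetical; the marks and the guessed mean of 50 are those of the example above.

```python
def mean_from_guess(marks, guessed_mean):
    """Mean found as M = A + (sum of D) / N, where D = mark - A."""
    deviations = [mark - guessed_mean for mark in marks]
    return guessed_mean + sum(deviations) / len(marks)

# Marks of pupils A to P from the example.
marks = [61, 40, 52, 37, 71, 47, 54, 32, 73, 45, 64, 38, 41, 50, 46, 53]

plus_total = sum(m - 50 for m in marks if m > 50)    # sum of the + column
minus_total = sum(50 - m for m in marks if m < 50)   # sum of the - column

print("plus column:", plus_total, " minus column:", minus_total)
print("mean by guessed-mean method:", mean_from_guess(marks, 50))
print("mean by direct method:      ", sum(marks) / len(marks))
```

Any whole number may be taken as the guess; the correction term always brings the result back to the true mean, so a poor guess costs only a little extra arithmetic.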

Median

The median is the mid-point in a distribution and the number
of cases above it is equal to the number below it. It is easy to find
the mid-point of a distribution which has an odd number of
cases, e.g. 3, 4, 5, 5, 7, 8, 8, 9, 10, for clearly 7 is the value of the
median, which is the fifth case.

If N is the number of cases and is odd, the median is the
$\frac{N+1}{2}$th case. In a distribution with an even number of
cases, we must take the mean value of the scores just above and
just below the centre point,* e.g. in 3, 4, 5, 5, 7, 8, 8, 9 the
median falls between 5 and 7 and can reasonably be given the
value 6.

From this we can extend our division of the distribution into
quartiles and percentiles. In the following distribution:

2, 2, 4, 5, 6, 7, 8, 9, 10, 10, 12, 13, 14, 14, 16

it is easy to see that 5, 9, 13 respectively are the values which lie
¼, ½, ¾ of the way along the distribution.

The measure representing the first quartile $Q_1$ is the
$\frac{N+1}{4}$th.

The measure representing the second quartile (median) is the
$\frac{N+1}{2}$th.

The measure representing the third quartile $Q_3$ is the
$\frac{3(N+1)}{4}$th.

When the number of measures increased by one is not exactly
divisible by 4 the same formulae hold: in the case of a large number
of cases it will usually suffice to give the value at each quartile
point as that of the nearest case. When we have a smaller number
of measures an estimate of the values can be made by simple
interpolation.

We may extend the division of the distribution into percentiles
(100 divisions), or deciles (10 divisions).

The xth percentile is the measure which is $\frac{x(N+1)}{100}$
from the beginning or lower end of the distribution. It is often
convenient to plot percentile scores on a piece of graph paper on
which a frequency curve or histogram is also drawn.
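
The rule of counting x(N + 1)/100 of the way along the ordered measures, with simple interpolation when the position falls between two cases, can be sketched in Python as follows. The function names are hypothetical, and clamping positions that fall outside the list to its ends is an assumption of this sketch rather than something stated in the text.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in an ordered list."""
    n = len(ordered)
    position = min(max(position, 1), n)   # assumption: clamp to the list ends
    lower = int(position)                 # whole part of the position
    fraction = position - lower
    if fraction == 0:
        return float(ordered[lower - 1])
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)   # simple linear interpolation

def percentile(measures, x):
    """The xth percentile, taken as the x(N + 1)/100 th ordered measure."""
    ordered = sorted(measures)
    return value_at_position(ordered, x * (len(ordered) + 1) / 100)

scores = [2, 2, 4, 5, 6, 7, 8, 9, 10, 10, 12, 13, 14, 14, 16]
print("quartiles:", percentile(scores, 25), percentile(scores, 50),
      percentile(scores, 75))

even_case = [3, 4, 5, 5, 7, 8, 8, 9]
print("median of the even-numbered case:", percentile(even_case, 50))
```

Deciles follow from the same function by taking x = 10, 20 and so on.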

If we know the marks at the 1st, 10th, 25th, 50th, 75th, 90th
and 99th percentiles, we have an excellent idea of the distribution
and by plotting a graph we can find a score corresponding to a
percentile, and a percentile (which gives us an idea of order and
merit or rank in the distribution) corresponding to a given score.

* Median is not quite the same thing as ‘mid-score’, as the median is strictly
a point and the mid-score will have a discrete value.

In a normal distribution a difference in percentile rank
corresponds to a greater difference in scores at the beginnings and ends
(the tails) of the distribution than in the middle. In fact as regards
mark equivalents the 1st, 6th, 22nd, 50th, 78th, 94th and 99th are
about equally spaced. We cannot therefore take the averages of
a pupil’s percentile levels in various subjects in the same way that
we can combine his scores.



[Figure: percentile scale. Horizontal axis: scores in test.]

Fig. 8. Percentile scale for class of 70. Here the scores are plotted horizontally
and the percentile levels and their equivalent cumulative frequencies are plotted
vertically. A point on the graph will give the score at any percentile level, or the
total number of people who have not reached a certain mark.

Finding percentiles when data are given in tabulated form

The results of examinations and tests are often given in tabulated
form and sometimes the statistical treatment of sets of marks is
easier if they are put into group frequencies.

Consider the following scores in an intelligence test. They are
given as the frequencies (the number of persons tested) falling
into each score range of 5 marks:


Test Scores    Frequency    Cumulative Frequency

135-139            0               0
130-134            5               5
125-129            8              13
120-124            9              22
115-119           12              34
110-114           18              52
105-109           25              77
100-104           18              95
 95- 99           20             115
 90- 94           13             128
 85- 89            6             134
 80- 84            7             141
 75- 79            2             143

Total number = 143.


The majority of percentile levels will fall inside one of the
classes or score ranges. In the above example with an awkward
number such as N = 143 all of them will fall within a class.

We can find the score corresponding to a given percentile (rank)
from the following formula:

$$X_p = L + \frac{C}{f}\left(\frac{PN}{100} - S\right)$$

where P = percentile, $X_p$ the value of the test score or other
measure falling at this percentile level,

L is the lower limit of the class in which $X_p$ lies,

S is the sum of all the frequencies (the number of persons
tested) up to but not including this class,

f the frequency within this class,

N the total of all the frequencies,

C the size of this class.

Example: Find the 77th percentile score if

P = 77, N = 143, L = 115, f = 12, C = 5, S = 109

$$X_{77} = 115 + \frac{5}{12}\left(\frac{77 \times 143}{100} - 109\right)$$

$$= 115 + \frac{5}{12}(110.1 - 109)$$

$$= 115 + \frac{5}{12}(1.1)$$

$$= 115.5 \text{ approximately}$$
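
The formula for a percentile score from grouped frequencies can be written as a small Python function. This is a sketch under stated assumptions: the function name is hypothetical, the classes are listed from the lowest score range upwards, all classes are taken to be of equal size, and the lower limit of each class is taken as its printed lower figure (115 for the class 115-119), as in the example.

```python
def score_at_percentile(lower_limits, frequencies, class_size, p):
    """X_p = L + (C / f) * (P * N / 100 - S) for equal-sized classes."""
    if not 0 <= p <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    n = sum(frequencies)
    target = p * n / 100          # P * N / 100
    below = 0                     # S: frequencies below the current class
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and below + f >= target:
            return lower + class_size * (target - below) / f
        below += f
    raise ValueError("no class contains the requested percentile")

# The intelligence test table, listed from the lowest class upwards.
lower_limits = [75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135]
frequencies = [2, 7, 6, 13, 20, 18, 25, 18, 12, 9, 8, 5, 0]

for p in (25, 50, 77):
    x_p = score_at_percentile(lower_limits, frequencies, 5, p)
    print(f"percentile {p}: score {x_p:.1f}")
```

The loop carries the running total S up the table, so the class in which the percentile falls need not be found by eye beforehand.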


Percentiles also offer a useful way of comparing sets of marks no
matter what are the scales of marking.

It is obvious that there is some advantage in giving a student’s
score in terms of a percentile for then the middle of the rank
would always be the 50th percentile. The unfamiliarity of this
method to the layman or the uninitiated would probably lead to
errors in its interpretation. Although percentiles give a ready
means of comparing distributions they must not be used for
combining them. Obviously percentile units are much closer to
one another near the middle of a distribution than they are at
each end.

The mean, median and mode are various ways of regarding the
central tendency in a distribution but it is also necessary to have
a measure of the spread or dispersion of the set of marks or other
measures. In order to secure a proper arrangement of a number
of pupils in order of merit it is obviously necessary that the marks
should not be bunched together at any point but should be
properly distributed. Again, when we come to consider the
problems of error in estimating psychological ‘factors’, it is
necessary to know how the errors are distributed. These are
two of the many instances of the use of methods of estimating
dispersion in mental measurement.

Interquartile Range

The quartile deviation is widely used. If the scores are arranged
in rank or order of merit, the difference in score between the first
and third quartile points is known as the interquartile range. We
arrange the scores in order of merit, find the score which is a
quarter of the way along the distribution and that which is
three-quarters of the way along the distribution and subtract the
scores. Dividing by two gives the quartile deviation Q (or the
semi-interquartile range):

$$Q = \frac{Q_3 - Q_1}{2}$$

It will be observed that $Q_2$ is the score of the mid-point in the
order of merit list. It is therefore the median score.

It will also be seen that as half the scores on one side lie between
the median and the first quartile point, and between the median
and the third quartile point on the other, the interquartile range
gives a measure of dispersion. If the median score can be taken
as a particular measure and the other scores in the distribution
as differing from it (either above or below) as deviations or errors,
the interquartile range will contain half the deviations and
therefore the semi-interquartile distance can be regarded as the
probable error.*

Mean Deviation or Average Deviation from the Mean (Mean Variation)

The deviations (differences) of the scores from the mean or
average are all regarded as positive and added together. This
sum is divided by the number of individuals or cases.

$$\text{Mean deviation, M.D.} = \frac{\sum |D|}{N}$$
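
Both of these measures of dispersion can be worked for the fifteen measures used earlier in the discussion of quartiles. The Python below is a minimal sketch: the function names are hypothetical, and the quartile function deliberately handles only the simple case in which N + 1 is exactly divisible by 4, so that no interpolation is needed.

```python
def quartile_deviation(measures):
    """Q = (Q3 - Q1) / 2, for the simple case where N + 1 divides by 4."""
    ordered = sorted(measures)
    n = len(ordered)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch needs N + 1 to be divisible by 4")
    q1 = ordered[(n + 1) // 4 - 1]        # the (N + 1)/4 th measure
    q3 = ordered[3 * (n + 1) // 4 - 1]    # the 3(N + 1)/4 th measure
    return (q3 - q1) / 2

def mean_deviation(measures):
    """Average of the deviations from the mean, all taken as positive."""
    mean = sum(measures) / len(measures)
    return sum(abs(m - mean) for m in measures) / len(measures)

scores = [2, 2, 4, 5, 6, 7, 8, 9, 10, 10, 12, 13, 14, 14, 16]
print("quartile deviation:", quartile_deviation(scores))
print("mean deviation:    ", round(mean_deviation(scores), 2))
```

Here the quartiles are 5 and 13, so the interquartile range is 8 and the quartile deviation 4.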

* The range from the 10th to the 90th percentile, called D by some writers, is a
useful measure of dispersion.

Standard Deviation

This measure of dispersion or spread is of great importance and
is that which is usually of the most value for mathematical
treatment and for the calculation of correlation coefficients. In finding
the Mean Deviation above, we regarded each of the deviations
as having a positive sign, which was not actually true. If each of
the deviations is squared this difficulty is overcome. Moreover,
the squaring of each deviation will tend to give due weight to
any comparatively large deviation. It also remains to be said that
the use of Standard Deviation is in keeping with the mathematical
properties of the curve of normal distribution.

To find the standard deviation each deviation is calculated and
squared. The column of squares is summed and this sum is
divided by the number of cases and finally the square root is
taken. S.D. is ‘root-mean-square’ and is usually represented by
the small Greek letter sigma, $\sigma$:

$$\sigma = \sqrt{\frac{\sum D^2}{N}}$$



Sometimes when we are comparing sets of scores it is necessary
to add a subscript to sigma (for example $\sigma_x$ or $\sigma_y$) to
indicate to which group of marks the standard deviation refers.
Readers who are not familiar with mathematical notation need not
be worried about the sign $\sum$, which is the large Greek sigma and
means ‘the sum of’.

Students should consider the following four methods of
computing the standard deviation, and choose that which appears to
be the easiest and most labour-saving in view of the given data.

1. The direct method. The mean (or average) is found, the
deviation of each score or mark from the mean is calculated, these
are squared and added, the sum is divided by the number of cases
and the square root is found.

In all these methods of calculating the standard deviation a set 
of tables of squares and square roots such as Barlow’s, logarithms 
and/or a simple slide-rule will be useful. It is hardly ever necessary 
to give the answer correct to more than two places of decimals and 
usually one will suffice.* 

* A word of warning ought to be given concerning the finding of square roots.
A rough mental estimate will always give the clue to the particular square which is
required and where the decimal point should be placed.

To square a number by logarithms, double the log of the number and find the
antilog. To find the square root halve the logarithm of the number and then find
the antilog. See the appendix for the use of the slide-rule for this and other
purposes.




2. Often the mean does not turn out to be a whole number
and the squares of the deviations contain decimal fractions which
cause considerable labour. In this case we guess a mean which is
a whole number and then apply a correction. A quick mental
calculation will suffice to supply the arbitrary mean.

Mark      A      D      D²

10        6     +4      16
 3        6     -3       9
 7        6     +1       1
 8        6     +2       4
 5        6     -1       1
 4        6     -2       4
---                    ---
37                      35

N = 6

Guessed mean A = 6

$$\frac{\sum D^2}{N} = \frac{35}{6} = 5.83$$

$$\text{True mean } M = \frac{37}{6} = 6.17$$

The formula for S.D. in this case is

$$\sigma = \sqrt{\frac{\sum D^2}{N} - (M - A)^2}$$

$$\sigma = \sqrt{5.83 - (6.17 - 6)^2} = \sqrt{5.83 - 0.03} = \sqrt{5.8} = 2.41$$

3. When there are only a few numbers to be considered and all
the scores or marks are whole numbers, it will suffice to call the
arbitrary ‘mean’ zero. Thus, the deviations (D) will be the
original marks (x) and the formula then becomes

$$\sigma = \sqrt{\frac{\sum x^2}{N} - M^2}$$

Mark x      x²

10         100
 3           9
 7          49
 8          64
 5          25
 4          16
---        ---
37         263

$$M = \frac{37}{6} = 6.17, \qquad \sum x^2 = 263$$

$$\sigma = \sqrt{\frac{263}{6} - 6.17^2} = \sqrt{43.83 - 38.03} = \sqrt{5.8} = 2.41$$



4. The mean can be calculated at the same time as the standard
deviation by using a modification of the formula of method 2,
which now becomes

$$\sigma = \sqrt{\frac{\sum D^2}{N} - \left(\frac{\sum D}{N}\right)^2}$$

which is obvious when we remember that

$$\text{True Mean} = \frac{\sum D}{N} + A \text{ (Arbitrary Mean)}$$

and D is the deviation from the guessed or arbitrary mean.
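
That the four methods lead to the same result can be confirmed with a short Python sketch using the six marks of the worked examples. The function names are hypothetical and the code simply follows the four descriptions in turn.

```python
import math

def sd_direct(marks):
    """Method 1: root-mean-square of the deviations from the true mean."""
    n = len(marks)
    mean = sum(marks) / n
    return math.sqrt(sum((x - mean) ** 2 for x in marks) / n)

def sd_guessed_mean(marks, a):
    """Method 2: deviations from a guessed mean A, corrected by (M - A)^2."""
    n = len(marks)
    mean = sum(marks) / n
    return math.sqrt(sum((x - a) ** 2 for x in marks) / n - (mean - a) ** 2)

def sd_zero_origin(marks):
    """Method 3: arbitrary mean of zero, so the deviations are the marks."""
    n = len(marks)
    mean = sum(marks) / n
    return math.sqrt(sum(x * x for x in marks) / n - mean ** 2)

def mean_and_sd(marks, a):
    """Method 4: mean and S.D. together from the sums of D and D squared."""
    n = len(marks)
    sum_d = sum(x - a for x in marks)
    sum_d2 = sum((x - a) ** 2 for x in marks)
    mean = a + sum_d / n
    return mean, math.sqrt(sum_d2 / n - (sum_d / n) ** 2)

marks = [10, 3, 7, 8, 5, 4]
mean, sd4 = mean_and_sd(marks, 6)
print(round(sd_direct(marks), 2), round(sd_guessed_mean(marks, 6), 2),
      round(sd_zero_origin(marks), 2), round(sd4, 2), round(mean, 2))
```

The choice between the methods is therefore one of convenience in the arithmetic only.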

Calculation of the Standard Deviation when the measures are given in
grouped frequencies

Even with the use of tables, slide-rules and calculating machines
there is considerable labour in calculating the S.D. of a large
number of measures. This may often be simplified by putting
them into frequency groups. Or it may happen that the measures
are originally given in this form.

The formula then becomes:

$$\sigma = \sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2}$$

in terms of the size of the interval (or extent of each group).

If we wish to express the formula in the same units as the
measure (i.e. in score form) the formula is

$$\sigma = i\sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2}$$

where i is the size of the interval of each group.

When a calculating machine is used the easiest form of this
expression is

$$\sigma = \frac{i}{N}\sqrt{N\sum fD^2 - \left(\sum fD\right)^2}$$

In each case all the scores in the interval are taken to have a
value equal to that given by the mid-point of the interval. D is
the deviation of each measure from an arbitrary mean and f the
frequency, i.e. the number of measures in each class or interval.


Example: In the following table the marks are given in the first
column, the mid-points of the intervals in the next and then the
frequency in each interval. Find the S.D.

    Marks     Mid-Point      f      D      fD     fD²
              of Interval
    91-100      95.5         1     +4       4     16
    81- 90      85.5         2     +3       6     18
    71- 80      75.5         3     +2       6     12
    61- 70      65.5         6     +1       6      6
    51- 60      55.5        11      0       0      0
    41- 50      45.5        12     -1     -12     12
    31- 40      35.5        10     -2     -20     40
    21- 30      25.5         6     -3     -18     54
    11- 20      15.5         3     -4     -12     48
     1- 10       5.5         1     -5      -5     25

    N = 55     $\sum fD = -45$     $\sum fD^2 = 231$

$$\frac{\sum fD^2}{N} = \frac{231}{55} = 4.20 \qquad \left(\frac{\sum fD}{N}\right)^2 = \left(\frac{-45}{55}\right)^2 = .67$$

$$\text{S.D.} = 10\sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2} = 10\sqrt{4.20 - .67} = 10\sqrt{3.53} = 10 \times 1.88 = 18.8$$
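
As a check on the grouped-frequency working, the calculating-machine form of the formula can be sketched as follows. The function name `grouped_sd` is illustrative; the frequencies and deviations are those of the example table, with deviations counted in intervals from the arbitrary mean 55.5.

```python
from math import sqrt

def grouped_sd(freqs, devs, interval):
    """S.D. in score units from grouped data: (i/N) * sqrt(N*sum(fD^2) - sum(fD)^2)."""
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, devs))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, devs))
    return (interval / n) * sqrt(n * sum_fd2 - sum_fd ** 2)

# Frequencies f and deviations D for the intervals 91-100 down to 1-10.
freqs = [1, 2, 3, 6, 11, 12, 10, 6, 3, 1]
devs = [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]

sd = grouped_sd(freqs, devs, interval=10)
print(round(sd, 1))   # 18.8
```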


Sheppard's Correction for Grouped Data

When the measures are grouped into a frequency distribution
the S.D. calculated by the method above is somewhat larger than
it would have been had the measures been dealt with separately.
It can easily be seen that when the deviations are squared, those
that lie beyond the mid-point will add relatively more to the sum
than those that lie on the 'smaller' side.* In the case of a normal
distribution Sheppard has shown that in terms of interval units
the $\sigma^2$ should be diminished by $\frac{1}{12}$. Thus the corrected
S.D. will be given by $\sqrt{\sigma^2 - \frac{1}{12}} \times i$ where $\sigma$ is the crude S.D.
found from the grouped frequencies and expressed in interval
units. This is equivalent to

$$\text{corrected S.D.} = i\sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2 - \frac{1}{12}}$$

$$\text{or} \quad \frac{i}{N}\sqrt{N\sum fD^2 - \left(\sum fD\right)^2 - \frac{N^2}{12}}$$
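
A small sketch of Sheppard's correction, assuming the usual reading that one-twelfth is subtracted from the variance measured in interval units. The function name is illustrative, and the sums are those of the grouped example above (N = 55, sum fD = -45, sum fD² = 231, interval 10).

```python
from math import sqrt

def sheppard_corrected_sd(n, sum_fd, sum_fd2, interval):
    """Corrected S.D. in score units: i * sqrt(crude variance - 1/12)."""
    crude_variance = sum_fd2 / n - (sum_fd / n) ** 2   # in interval units
    return interval * sqrt(crude_variance - 1 / 12)

crude = 10 * sqrt(231 / 55 - (-45 / 55) ** 2)
corrected = sheppard_corrected_sd(55, -45, 231, interval=10)
print(round(crude, 1), round(corrected, 1))
```

The corrected value is slightly smaller than the crude one, as the text leads us to expect.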

As we shall see later when we are studying normal distribution,
the standard deviation is a most important measure of dispersion.
For instance, if we assume normal distribution and know the
value of the mean (which in this case will also be equal to median
and mode values) we can calculate in terms of the standard
deviation the y value (number of cases) for any x value (score or
marks). If we assume, for instance, that intelligence quotient (I.Q.)
is distributed normally' and we know the standard deviation and
can assume a mean of 100, we can at once calculate the percentage
of population possessing particular intelligence quotients, or with
I.Q.s between one level and another. This will be understood by
a consideration of the properties of the curve dealt with in
Chapter V.

* The matter is further complicated by the fact that each interval in the diagram
has a trapezoidal shape.


Standardized and Normalized Scores

If the scores in a test are represented as measures below or
above their average, and they are then divided by their standard
deviation, they are represented by $z_1$, $z_2$, etc. and are said to be
standard (or z) scores. Approximately two-thirds of the scores will
lie between +1 and -1. If the scores can be taken to be distributed
normally each set of scores can be regarded as equivalent and
comparable. Standard scores can be regarded as deviations from
the mean which have been adjusted so that the standard deviation
is unity. (It is possible that to call the average mark 0 and to
make all marks below it negative may have a bad psychological
value, but in the statistical handling of scores it is often the most
convenient way.) Sometimes the scores are normalized by dividing
their differences from the mean by $\sigma\sqrt{N}$, that is, by the product
of the standard deviation and the square root of the number of
persons. Standardized scores can be converted to normalized
scores by dividing by the root of the number of persons. In the
case of normalized scores it will be seen that the sum of the squares
of the scores is unity,* and as we shall see later the sum of their
products is the correlation coefficient.

The variance of a set of scores is the square of the standard
deviation. Where a set of scores has been standardized the
variance will clearly be unity. We shall use this again when we
meet factorial and variance analysis.

It may be useful to return to the question of percentiles and to
think of them in terms of standard scores.

* There is evidence that this is not quite true.

' See Appendix VI.

Assuming a normal distribution:

    Percentile     Mark     Standard Score: Deviation from
    Level                   mean (50) ÷ S.D. 10
       99           73         + 2.3
       90           63         + 1.3
       75           57         +  .7
       50           50            0
       25           43         -  .7
       10           37         - 1.3
        1           27         - 2.3

The limits of the distribution are taken to be + 3 S.D. to
- 3 S.D. (In the area under the normal curve (see Chapter V) only
about 0.3% of measures lie outside this range.)

For psychological reasons the mean might be taken as 60
instead of 50, all the marks then being raised by 10. This does
not affect the distribution.

Intelligence tests differ with respect to both their mean and
their standard deviation. Scores can only be compared by
standardization. In the Moray House Tests the mean is taken as
100 and the S.D. 15.
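
The percentile tables can be reproduced, at least approximately, from the normal curve itself. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `mark_at_percentile` is an illustrative choice, and a strictly normal distribution is assumed.

```python
from statistics import NormalDist

def mark_at_percentile(percentile, mean=50.0, sd=10.0):
    """Return (standard score, mark) at a percentile of a normal distribution."""
    z = NormalDist().inv_cdf(percentile / 100)   # deviation in S.D. units
    return z, mean + sd * z

# Scale with mean 50 and S.D. 10.
for p in (99, 90, 75, 50, 25, 10, 1):
    z, mark = mark_at_percentile(p)
    print(p, round(mark), f"{z:+.1f}")

# Scale with mean 100 and S.D. 15, as quoted for the Moray House Tests.
for p in (95, 84, 16, 5):
    z, mark = mark_at_percentile(p, mean=100, sd=15)
    print(p, round(mark), f"{z:+.1f}")
```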



    Percentile       Score     Standard Score'
    99 (approx.)      135        + 2.3σ
    95                125        + 1.7σ
    90                120        + 1.3σ
    84                115        + 1.0σ
    75                110        +  .7σ
    50                100           0
    25                 90        -  .7σ
    16                 85        - 1.0σ
    10                 80        - 1.3σ
     5                 75        - 1.7σ
     1 (approx.)       65        - 2.3σ

' Some writers do not differentiate between standard and standardized scores,
but this need not cause the reader any confusion. A standard score really means a
score given as a deviation from the mean with the standard deviation as unit, i.e.
deviation divided by standard deviation. Standardized scores mean those that
have been adjusted to an agreed mean and standard deviation. Before such
adjustment the scores are called raw scores.




It will be observed that the scores one standard deviation from
the mean fall at the 16th and 84th percentile levels.

Sometimes it is necessary to convert these sigma or z scores to
a scale with a given mean and a given standard deviation. Such
an operation would also obviate the necessity of using negative
scores and those with decimal fractions. Such scores were called
T scores by McCall in How to Measure in Education. All that is
necessary is to multiply each z score by the given S.D. and add
the result to (or, if negative, subtract it from) the given mean.
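
The conversion just described can be sketched in a few lines. The raw scores are invented for illustration, and the function names `z_scores` and `rescale` are assumptions of this sketch rather than terms from the text.

```python
from statistics import fmean, pstdev

def z_scores(raw):
    """Deviation of each raw score from the mean, in S.D. units."""
    mean = fmean(raw)
    sd = pstdev(raw)          # S.D. of the whole set (divisor N)
    return [(x - mean) / sd for x in raw]

def rescale(z, mean, sd):
    """Convert z scores to a scale with an agreed mean and S.D."""
    return [mean + sd * value for value in z]

raw = [38, 39, 61, 49, 29]    # hypothetical raw marks
z = z_scores(raw)
scaled = rescale(z, mean=50, sd=10)
print([round(v, 2) for v in z])
print([round(v) for v in scaled])
```

After rescaling, the set has mean 50 and S.D. 10 whatever the raw marks were, which is what makes different tests comparable.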


Measure of Skewness

If a distribution is symmetrical, its median, mode and mean
are at the same point. If a distribution has a positive skew, that
is, if it has a long tail stretching towards the high scores, its
median will be less than its mean and its mode will usually be
less than either.

$$\text{Skewness } Sk = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{M - Mo}{\sigma}$$

or

$$Sk = \frac{3(M - Md)}{\sigma}$$

where Md is the median.

[A less useful measure of skewness is given by

$$Sk = \beta_1 = \frac{\sum x^3}{N\sigma^3}$$

where the x's are deviations from the mean, and N is the number
of measures in the distribution.]

[The shape of a symmetrical distribution is measured by its
kurtosis or flatness $\beta_2$

$$\beta_2 = \frac{\sum x^4}{N\sigma^4}$$

For normal distribution $\beta_2 = 3$.]

$$\text{Mode} = \text{mean} - \frac{\text{mean} - \text{median}}{C}$$

For many curves and for moderate degrees of skewness $C = \frac{1}{3}$.
Thus, to compute the mode from the mean and the median

    Mode = M - 3(M - Md)
         = 3Md - 2M

(which could have been obtained by equating the first two
expressions given above for Sk.)
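
The median form of the skewness measure and the estimate Mode = 3Md - 2M can be tried on a small positively skewed set. The scores below are invented for illustration and the function names are assumptions of this sketch.

```python
from statistics import fmean, median, pstdev

def skewness_from_median(scores):
    """Sk = 3(M - Md) / S.D., the median form of the measure."""
    return 3 * (fmean(scores) - median(scores)) / pstdev(scores)

def estimated_mode(scores):
    """Mode estimated as 3Md - 2M, valid only for moderate skewness."""
    return 3 * median(scores) - 2 * fmean(scores)

scores = [2, 3, 3, 4, 4, 4, 5, 6, 9, 12]   # long tail towards high scores
print(round(skewness_from_median(scores), 2))
print(round(estimated_mode(scores), 2))
```

Here the mean (5.2) exceeds the median (4), so Sk is positive, and the estimated mode falls below both.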

Coefficient of Variability

The relation which the probable deviation bears to the mean
score is of interest as it gives a measure of the variability. We
have already seen that the semi-interquartile range Q is equal to
the probable deviation (P.E.).

Thus the variability is $\frac{Q}{M}$

If this is expressed as a percentage it is called the coefficient of
variability.

$$V = \frac{100\,Q}{M}$$

This is quite independent of the measures used, whether they
are marks or the weights of human beings. In general, if V is
greater than 1/4 or 25% the dispersion is regarded as being rather
large and the results should be used with great caution.'

' V is also used for Variance, and its two uses should not be confused.

CORRELATION AND REGRESSION

If we consider the marks in science and mathematics gained by
the members of a class, we should feel justified in expecting that
there may be some relation between them. We should hardly
anticipate that the top boy in science would also be the top boy in
mathematics and that all the boys would have the same orders in
both subjects until we came to the unfortunate boy who was at
the bottom of the list in science and also in mathematics. If this
curious relationship between the mark lists in these subjects did
exist, with its exact correspondence of one order to the other, we
should say that the marks were perfectly correlated positively. If the
orders of the marks in both subjects were reversed, so that the top
boy in one subject was the bottom boy in the other, the second boy
in the science list was the last but one in the mathematics list, and
so on (this is unthinkable, of course!), we should say that here
was a case of perfect negative correlation. If the marks in science bore
no relation at all to those in mathematics we should say that there
was no correlation. In practice we should expect to find some
positive connection between marks in these two subjects, but it
would be partial or imperfect correlation. This type of correlation
is most important when we consider examination marks, and the
scores in psychological and other tests; and exact mathematical
methods for dealing with it are of the utmost importance in many
educational and psychological researches. The correlation coefficient
is almost as important to the psychological tester as is the
balance to the chemist. As we shall see in a later chapter, many
extraordinary assertions were made by educationists and psychologists
in the past, and continue even today, because statements
concerning human abilities or 'intelligence' had not been subjected
to rigorous analysis in which the use of correlation coefficients is
invaluable. Other techniques are sometimes more
valuable, but a clear idea of correlation is none the less of prime
importance.

We can obtain a useful graphical idea of the degree of correlation
between sets of numbers by plotting a scatter diagram or
scattergram. Suppose we plot the scores in two subjects or tests of
a number of individuals, giving a point on a two-dimensional
graph to each individual. The co-ordinates of each point (x, y)
are measures of the scores in each subject. Suppose further that
the scores have been standardized by calling the mean (average)
of each set zero, and then dividing each deviation from zero by
the standard deviation of the set.

If there were no correlation between the x and y values (the
scores in each test) the points representing the individuals would
be distributed in a haphazard manner over the graph paper,
that is to say, there would be a fairly even density of points on the
graph paper, provided that we had taken results from a sufficiently
large number of individuals. If there existed some degree of
correlation between the x and y scores, we should find that the
points tended to bunch together and were more dense in a certain
direction than in others. When we plot points in this manner we
are said to have made a scatter diagram or scattergram.

Where correlation is present we can find a line which best fits
the distribution of the points which we have plotted. Few, if any,
points will lie on it, but the line will go through the cluster of
points so that there is a 'balance' of points on each side of it. It
would then be the line of best fit, and would be known as the line
of regression.* (Although this term is not particularly apt for
psychological work, it is invariably used. It is a biometric term
used by Galton to show that the average heights of offspring tend
to 'regress back towards the mean of the race'.)

Suppose that the correlation is a perfect positive one. The
points would be bunched together in the first and third quadrants
and the line of best fit would make an angle of 45° with the
positive x axis. If, on the other hand, there was perfect negative
correlation, the points would be bunched together in the second
and fourth quadrants, and the line of regression would be at
right angles to that representing perfect positive correlation. In
education and psychology we usually find that correlation, if
present, is partial positive correlation. Thus we shall find the
lines of regression in the first and third quadrants (or if we are
dealing with 'raw' or untreated scores upwards from 0 in the
first quadrant).

The slope of the regression line, that is its y/x value, or the
tangent of the angle which it makes with the x axis, is equal to r,
the correlation coefficient.†

In the case of perfect positive correlation, writing $x_1$ and $x_2$ as
deviations from their means and $\sigma_1$, $\sigma_2$ as the respective standard
deviations,

$$\frac{x_2/\sigma_2}{x_1/\sigma_1} = 1 \; (= \tan 45°)$$

* The sum of the squares of the distances of the points from the line should be
a minimum.

† See Appendix I.

When there is no correlation we should have to regard the
regression line as being horizontal, with a slope of 0.

In general the slope of the regression line of $x_2$ on $x_1$ is given by

$$\frac{x_2/\sigma_2}{x_1/\sigma_1}$$

Thus the coefficient of correlation r is

$$r = \frac{x_2/\sigma_2}{x_1/\sigma_1}$$

or $x_2 = r x_1 \frac{\sigma_2}{\sigma_1}$. This is the equation of the regression line.

The regression line of $x_1$ on $x_2$ makes the same angle with the
vertical axis as the regression line of $x_2$ on $x_1$ does with the
horizontal axis. The equation of this line of regression ($x_1$ on $x_2$) is

$$x_1 = r x_2 \frac{\sigma_1}{\sigma_2}$$

Before leaving the subject of regression it may be useful to note
that regressions do not seem to obey the ordinary algebraic rules;
for instance, the regression of x on y may be written x = ry and
that of y on x will be y = rx. Thus the regressions occur in pairs

    x = ry
    y = rx

The following diagrams will help to explain the phenomenon of
regression.

[Fig. 11]

In each case the vertical lines represent two sets of scores. In
(b) there is no correlation, and any set of frequencies in one
set may be matched with any score throughout the range of
the other. In (a) where there is some positive correlation the
frequency group in one set corresponds with a spread of scores in
the second, but for the most part on the same side of the mean.
In the panel showing negative correlation the spread tends to
be on the opposite side of the mean. In (c) there is perfect
correspondence between the scores and correlation is complete
and there is no regression. Thus it can be seen that from regression
equations we can estimate the value of one or other variable for
an individual when we know the correlation coefficient between
the two variables. As the correlation coefficient increases, our
accuracy of estimation of the one variable will improve as shown
in the table and diagram on page 48. It will repay the student to
consider the significance of the correlation coefficient as it appears
in regression equations; its use in the prediction of the amount
of regression will give a clearer idea of its nature than that which
comes from its calculation from formulae. This will also serve to
explain the apparent paradox of the two regression equations such
as y = rx and x = ry, for just as there is an uncertainty varying
with the amount of regression in predicting an x value from a y,
so also there exists a similar uncertainty in predicting a y value
from an x.
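
The apparent paradox of the two regression equations can be seen numerically. In the sketch below the correlation coefficient and the two standard deviations are invented values, and `predict_deviation` is an illustrative name; all scores are deviations from their own means.

```python
def predict_deviation(known_dev, r, sd_known, sd_predicted):
    """Most likely deviation of one variable given the other:
    predicted = r * known * (sd_predicted / sd_known)."""
    return r * known_dev * sd_predicted / sd_known

r = 0.6                    # hypothetical correlation coefficient
sd_1, sd_2 = 10.0, 15.0    # hypothetical S.D.s of the two sets

# A pupil 20 marks above the mean in the first test:
x2_hat = predict_deviation(20.0, r, sd_1, sd_2)
# Predicting back from that estimate does not return 20:
x1_back = predict_deviation(x2_hat, r, sd_2, sd_1)
print(x2_hat, x1_back)
```

Each prediction regresses towards the mean by the factor r, so going there and back shrinks the original deviation by r squared rather than restoring it. The two equations are separate best estimates, not algebraic inverses of one another.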

A reference to scatter diagrams will also serve to reveal whether
correlation is linear. We shall see that although it is usually safe
to assume that it is so, this is not invariably the case, and the line
of best fit is then not a straight line.' Although the correlation
coefficient is a measure of the degree of relationship between two
sets of measures, it is not directly proportional to the degree of
relationship. For instance, a correlation coefficient of .7 does not
represent twice the degree of relationship given by a correlation
of .35. It is also necessary to interpret the correlation coefficient
in relative terms. A correlation coefficient of .9 would not be
high in the case of two 'paired' and similar mental tests, whereas
in determining the degree of relationship between a physical and
a mental characteristic it would be difficult to find a value of r
much greater than .5. It is common to speak of a value of r less
than .3 as low, from .3 to .7 as medium, from .7 to .9 as high, and
above .9 very high, but without reference to the meaning of the
sets of measures which have been correlated, such terms may be
entirely misleading.

As we have already noted, the correlation coefficient enables us
to predict with a degree of reliability which is known (and should
be allowed for) the most likely value of a variable in one set, when
that in the other set and the correlation coefficient between the
sets are known. The diagram and table on page 48 illustrate
this.


' See page $6.

The Product-Moment Method of Calculating the Correlation Coefficient
(sometimes known as the Bravais-Pearson Method')

From the general consideration of the degree of agreement
between sets of measures or scores arranged so that the average of
each is adjusted to be zero, which we have seen when we were
dealing with regression, it is apparent that if there is no measure
of agreement between one set of measures and another, the sum
of the products of deviations of corresponding scores (carrying
appropriate signs) from the mean will tend to be zero. If there
is a tendency for measures above the mean in one set to correspond
to measures which are above the mean in the other (taking
the mean as zero), and those below in one to correspond with
those below in the other, it is obvious that the total of the products
of the deviations of each score in one set and the corresponding
score in the other will be a positive number. Thus, the product of
the deviations will give an idea of the existence of a positive
correlation. In the same way, if the positive deviations of one set
tend to correspond with negative deviations of the other their
product will be a negative number and will give an idea of the
negative correlation between them.

The exact formula, known as the Product-Moment or Bravais-Pearson
formula for the correlation coefficient, which is written
as r, is

$$r = \frac{\sum x_1 x_2}{N \sigma_1 \sigma_2}$$

where $x_1$ and $x_2$ are the deviations of the respective scores in each
case from the mean of each set, N is the number of cases, e.g. the
number of pupils in a class, and $\sigma_1$ and $\sigma_2$ are the standard
deviations of the respective sets of scores. If the scores have
already been standardized by dividing their deviations from their
respective means by the standard deviations, the formula becomes

$$r = \frac{\sum z_1 z_2}{N}$$

where $z_1$ and $z_2$ are the standardized scores, and further, if the
scores have been normalized by dividing the standardized scores
by $\sqrt{N}$, the correlation coefficient will then become $r = \sum s_1 s_2$
where $s_1$ and $s_2$ are the normalized scores.

' Bravais, a French statistician of the nineteenth century, first used the idea of
product-moments, and his work was improved by Galton. Karl Pearson (1857-1936),
scientist and statistician, may be regarded as the successor of the latter.
The name product-moment refers to the products of the moments (or the weights)
of the scores in relation to their deviation from the mean.

Where the correlation coefficient is calculated from the deviations
$x_1$ and $x_2$, the means will hardly ever be whole numbers,
and the exact determination of $\sum x_1 x_2$ is apt to be a laborious
process. When calculating standard deviations we saw how it was
possible to use an arbitrary or guessed mean which was a whole
number, and if $x_1$ is now a deviation from an arbitrary mean the
standard deviation

$$\sigma_1 = \sqrt{\frac{\sum x_1^2}{N} - \left(\frac{\sum x_1}{N}\right)^2}$$

The formula for the correlation coefficient therefore becomes

$$r = \frac{\dfrac{\sum x_1 x_2}{N} - \dfrac{\sum x_1}{N} \times \dfrac{\sum x_2}{N}}{\sigma_1 \sigma_2}$$

Example. Use the simple formula for calculating the correlation
coefficient from the data given opposite.

$$r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{6035}{\sqrt{11{,}671} \times \sqrt{7{,}616}} = \frac{6035}{108.0 \times 87.3} = .6401$$

Example: calculation of correlation coefficient by
product-moment method using the simple formula

    Pupil's   % Maths   % Physics   X - Mean X   Y - Mean Y
    No.          X          Y          = x          = y        x²     y²     xy

1 

38 
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-7 
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49 
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2 

39 

44 

-4 > 

“ 13 

16 

169 

sa 

3 

61 

62 

18 , 
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324 


90 

4 

49 
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-3 

36 

9 

- 18 

5 

29 

50 
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-7 
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49 

98 

6 

s> 

72 
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»S 

64 
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7 
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70 

ai 

13 
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H 

S 9 

70 
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13 
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9 

ay 

O4 
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7 
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49 
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10 

27 

Oo 

- 16 

3 

2s6 

9 
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1 

*9 

74 
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579 
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12 

6I 

Oo 

t8 
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9 

54 

>3 
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0 

44 > 
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9 
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48 

«3 ; 

-9 

169 

81 

- "7 

34 

24 

38 

- t9 1 

- 19 

3(11 

36 ' 

361 

3 S 

57 

04 

»4 ! 

7 

196 

49 

98 

36 

1 

46 

-32 1 

- 11 

1.024 

121 

35a 

^2 

39 

5O 

-4 : 

- 1 

lO 

t 1 

4 

3 « 

to 

7a 

•' 1 

15 

289 

aas 

ass 

39 

ay 

5O 

-M 

- 1 

196 

1 

'4 

40 

27 

3 a 

-.6 1 

-as 

356 

625 

400 


1.707 

a.384 


. Totals 

11.671 

' 7.616 

6.035 


    Mean X = 43 (rounded)     Mean Y = 57 (rounded)

It will be observed that even with only 40 measures in each set
and the slight inaccuracy introduced by taking whole numbers
for the mean, considerable labour is involved; tables of squares and
square roots, logarithms and a slide-rule may be used to reduce
the labour of computation. A calculating machine which will add,
multiply, square (and if possible divide) is of great use where much
of this work is done.
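
With a modern machine the labour disappears. The sketch below gives the simple product-moment formula in two forms: one working from raw scores and one working from the three column totals quoted in the example (sum of xy = 6,035, sum of x² = 11,671, sum of y² = 7,616). The function names and the small test data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment r from raw scores, using deviations from the true means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return r_from_sums(sum_xy, sum(a * a for a in dx), sum(b * b for b in dy))

def r_from_sums(sum_xy, sum_x2, sum_y2):
    """r = sum(xy) / (N * sd_x * sd_y), which reduces to sum(xy)/sqrt(sum(x^2)*sum(y^2))."""
    return sum_xy / sqrt(sum_x2 * sum_y2)

print(round(r_from_sums(6035, 11671, 7616), 4))   # 0.6401
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```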


[Fig. 12: scatter diagram of the 40 pairs of measures. The columns
carry the X deviations from -5 to +6, with the row frequencies Fy at
the right; the column frequencies Fx along the bottom are
1, 2, 2, 6, 8, 7, 6, 4, 3, 1, 0, 0, making the total N = 40.]

To avoid this type of calculation it is better to draw a scatter
diagram of the data to be correlated and proceed as follows.
(Often the data will have been given in grouped frequencies at
the start and therefore the grouping of the measures in the form
of a scatter diagram on squared paper is the obvious next step.)

Here the measures have been grouped into 12 rows and 12
columns. These numbers need not have been equal but 10 or 12
should be regarded as a minimum, otherwise the grouping will be
too coarse and as the S.D.s calculated by this method will be too
large the correlation coefficient will tend to be too small. The
two sets of measures to be correlated are denoted by X and Y.
Convenient arbitrary means are chosen and the deviations of each
group of measures are given above and below the respective means
taken as 0. The figures in the cells give the numbers (frequencies)
of the cases with the corresponding X and Y values.

It should be seen that the separate totals of the X and Y
frequencies each come to N (the total number of cases).

Stage 1. To calculate the Standard Deviations

We use the standard method of grouped frequencies as given
in Chapter II.

    Deviations   Frequencies    Frequencies ×    Frequencies ×
                                 Deviations      (Deviations)²
     x, y         fx     fy      fx·x    fy·y     fx·x²   fy·y²
      +6           0               0                0
      +5           0      1        0       5        0      25
      +4           1      2        4       8       16      32
      +3           3      4        9      12       27      36
      +2           4      3        8       6       16      12
      +1           6      8        6       8        6       8
       0           7      6      (+27)   (+39)      0       0
      -1           8      6       -8      -6        8       6
      -2           6      2      -12      -4       24       8
      -3           2      2       -6      -6       18      18
      -4           2      3       -8     -12       32      48
      -5           1      2       -5     -10       25      50
      -6                  1               -6               36
                                 (-39)   (-44)
    N =           40     40      -12      -5      172     279

$$\frac{\sum f_x x}{N} = \frac{-12}{40} = -.300 \qquad \left(\frac{\sum f_x x}{N}\right)^2 = .09$$

$$\frac{\sum f_y y}{N} = \frac{-5}{40} = -.125 \qquad \left(\frac{\sum f_y y}{N}\right)^2 = .016$$

(We shall need the value $\frac{\sum f_x x}{N} \times \frac{\sum f_y y}{N}$ for the numerator
of the correlation formula and in this case it is very small;
-.3 × -.125 = .037.)

$$\frac{\sum f_x x^2}{N} = \frac{172}{40} = 4.30 \qquad \frac{\sum f_y y^2}{N} = \frac{279}{40} = 6.97$$

$$\sigma_x = \sqrt{4.30 - .09} = \sqrt{4.21} = 2.05$$

$$\text{Similarly } \sigma_y = \sqrt{6.97 - .016} = \sqrt{6.954} = 2.64$$

Stage 2. To find the sum of the total x and y products

The frequency (number of cases) in each cell must be multiplied
by the product of its x and y values. This can be done by considering
each possible product, and finding the total frequencies of the
cells with each value. It is obvious that any cell with a zero value
for x or y will contribute nothing to the total. The cells may be
crossed out in pencil as they are dealt with. The total frequencies
should come to N.

A table of three columns may be constructed to give respectively
the possible products (those which are not represented by actual
cases need not be written down), the frequencies, and the product
f × xy.

      xy       f      fxy
       0      10        0
     + 1       3        3
     + 2       2        4
     + 3       1        3
     + 4       1        4
     + 6       5       30
     + 8       1        8
     +12       1       12
     +15       3       45
     +20       1       20
     +24       1       24     (+153)
     - 1       6      - 6
     - 2       1      - 2
     - 3       1      - 3
     - 8       3      -24     (-35)

    N =       40      118

$$\frac{\sum fxy}{N} = \frac{118}{40} = 2.95$$

After the correction has been subtracted the numerator is
2.95 - .037 = 2.913

$$r = \frac{2.913}{\sigma_x \times \sigma_y} = \frac{2.913}{2.05 \times 2.64} = .54$$
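
The two stages can be combined in a short sketch that takes only the column totals from the scatter diagram. The function name `grouped_r` is illustrative; the totals are those found above (N = 40, sum fx·x = -12, sum fy·y = -5, sum fx·x² = 172, sum fy·y² = 279, sum f·xy = 118).

```python
from math import sqrt

def grouped_r(n, sum_fx, sum_fy, sum_fx2, sum_fy2, sum_fxy):
    """Correlation from grouped data with deviations about arbitrary means."""
    mean_x = sum_fx / n
    mean_y = sum_fy / n
    sd_x = sqrt(sum_fx2 / n - mean_x ** 2)        # Stage 1
    sd_y = sqrt(sum_fy2 / n - mean_y ** 2)
    numerator = sum_fxy / n - mean_x * mean_y     # Stage 2, with the correction
    return numerator / (sd_x * sd_y)

r = grouped_r(40, -12, -5, 172, 279, 118)
print(round(r, 2))   # 0.54
```

Because the deviations are counted in intervals for both variables, the interval sizes cancel and never enter the calculation.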

Another method which is sometimes simpler than the above is
to apply the formula

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2\,\sigma_x \times \sigma_y}$$

where $\sigma_x$ and $\sigma_y$ are the S.D.s as before and $\sigma_d$ a third S.D.
calculated as follows:

By means of a ruler or straight edge inclined at an angle of
45° to the horizontal the number of measures falling in diagonals
taken from the top left-hand corner to the bottom right-hand
corner are considered. (Each diagonal will lie at right angles to
the line drawn from the top left-hand corner to the bottom right-hand
corner. The first diagonal containing any measures will be
that drawn from y = +1 to x = -2 and this will contain two
measures; the next, from y = 0 to x = 0, contains no measures.)

The measures will read as follows from the table on page 38:

2, 0, 1, 1, 6, 8, 6, 5, 9, 1, 0, 0, 1, making the correct total of 40.

By choosing an arbitrary mean the S.D. is calculated as before
and this will supply the value $\sigma_d$ for the formula, which is then
worked.

Rank Correlation

The product-moment method of finding the correlation coefficient
is undoubtedly the best way for use in scientific investigations
but when the number of cases to be considered is less than 30
the method of ranks is just as reliable, and in some cases is even more
so. The ranks (or orders of merit) in the two sets of marks or test
scores are written against the names of the pupils. (It is usually
convenient to write the names in order of merit in one subject
and in a column to the right to add the correct order in the other
subject with which we seek correlation.) The difference in rank
is written in the next column and in a fourth this difference is
squared. This column, which contains only positive numbers, is
then totalled. The difference of rank is called d, each difference
is squared ($d^2$) and these squares are summed ($\sum d^2$).

If we consider N pupils (or cases) it is easy to prove that if N is
not too small, the sum of the differences of ranks squared which
would result from pure chance' or probability would be

$$\frac{N(N^2 - 1)}{6} \quad \text{or} \quad \frac{N(N - 1)(N + 1)}{6}$$

' See Appendix IV.

[As N is a whole number, notice that $\frac{N(N^2 - 1)}{6}$ is also a whole
number.]

The fraction of disagreement between two sets of orders of merit
or ranks could therefore be expressed as follows:

$$\frac{\text{Sum of the actual differences squared}}{\text{Sum of the differences squared which might be expected by chance}} = \frac{\sum d^2}{\frac{1}{6}N(N^2 - 1)}$$

If this is a measure of the amount of disagreement of the ranks,
the measure of the agreement or correlation may be written as

$$1 - \frac{\sum d^2}{\frac{1}{6}N(N^2 - 1)} \quad \text{[by subtracting from unity]}$$

This correlation coefficient using ranks is written as $\rho$ (the
Greek letter rho).

It is related to r, the correlation coefficient obtained by the
product-moment or line of regression methods, by the formula:

$$r = 2 \sin \frac{\pi}{6}\rho$$

Usually this transformation is hardly worth while. By using
ordinary tables of sines the method is as follows:' Multiply $\rho$ by
30°. Look up the sine of the resulting angle and double it. This
gives r. This relation between $\rho$ and r is only true on the
average of many occasions.

' The angle $\pi$ (radians) = 180°, so that $\frac{\pi}{6}\rho = \rho \times 30°$.


Example: calculation of correlation coefficient by ranks method

    Name       Rank in French     Rank in History     d²

Ashley 

281 

261 

4 

Ascough 

25 

24 

1 

Beaumont 


I 

2 j 

Clifton 

1 

a 

564 

Champkins 

281 

29 

4 

Evans 

>94 

38 

3424 

Foster 

3>4 

39 

564 

Gill 

38 

6 

1,024 

Gasper 

>31 

74 

36 

Gray I 

281 

114 

289 

Gray II 

38 

>94 

3424 

Green 

1 

4 

9 

Goodman 

38 

1 314 

564 

Harrison 


I 5 

204 

Hawlcv 

334 ! 

! >5 1 

3424 

Hill 

224 1 

1 3*4 


Jackson 

36 ! 

29 

1 

1 49 

LjTnn 1 

5 1 


1 64 

Marriot ! 

164 ! 

! >94 

9 

MacEwan j 

38 

37 ! 

I > 

Norman I i 

3 >i 

' 35 i 

>24 

Norman II 

5 

21 

256 

Nelson 

•'4 

! 33 

4624 

Newham I 

28I 

1 >5 

1824 

Newham II j 

25 i 

1 22 

9 

Peak j 

>64 

' 29 

1564 

Powdril ^ 

25 

1 26I 

1 24 

Pickcrsgill | 

21 

1 24- 

9 

Pillatt ] 

>34 

1 36 

5 (rf >4 

Rivers 

Ihi 

>74 

1 

Robinson 1 

*94 

40 

42*4 

Shaw 

5 

9 

16 

Shrewsbury 

164 

*74 

I 

Stafford 

334 

34 

4 

Thornton 

114 

>>4 


Walker 

74 

3 

204 

Wilcox 

74 

*5 

564 

Wright 

35 


13 

Warkinson 

224 

74 

22 r, 

Wardle 

24 

10 

564 


Σd² = 5190¼

$$\rho = 1 - \frac{\sum d^2}{\tfrac{1}{6} N (N^2 - 1)}$$

$$= 1 - \frac{5190\tfrac{1}{4}}{\tfrac{1}{6} \times 40 \times 1599}$$

$$= 1 - \frac{6}{40 \times 1599} \times \frac{20761}{4}$$

$$= 1 - .486.$$

Rank Correlation = .51
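
The arithmetic of the ranks method can be checked with a short Python sketch. The function names `rank_correlation` and `rho_to_r` are illustrative and not part of the text; the inputs are the total Σd² = 5190¼ and N = 40 of the worked example, and the second function applies the sine relation between ρ and r given earlier.

```python
import math

def rank_correlation(sum_d_squared: float, n: int) -> float:
    """Rank correlation rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))

def rho_to_r(rho: float) -> float:
    """Convert rho to product-moment r using r = 2 * sin(pi/6 * rho)."""
    return 2.0 * math.sin(math.pi / 6.0 * rho)

rho = rank_correlation(5190.25, 40)   # totals from the worked example
print(f"rho = {rho:.2f}")
print(f"r   = {rho_to_r(rho):.2f}")
```

As the text notes, the conversion to r holds only on the average of many occasions, so the second figure is a rough guide rather than an exact equivalent.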


Spearman's Footrule

A 'rough and ready' way of finding whether there is any agree-
ment between two sets of results is Spearman's 'Footrule'. It is
not intended to yield precise mathematical results but it often gives
a quick method of finding whether any correlation exists, and it
may be used to prepare the way for more exact investigations.

Only the gains in rank are noted, and the losses, which must in
the long run be equal to the gains, are neglected. The gains are
added: Σg.

The coefficient of correlation*

$$R = 1 - \frac{\sum g}{\tfrac{1}{6} (N^2 - 1)}$$

It is rarely necessary to correct the 'Footrule' coefficient R into
r, the true correlation coefficient, but this can be done by using the
formula

$$r = 2 \cos\left[\frac{\pi}{3} (1 - R)\right] - 1$$

Here is an example using the data given overleaf:

$$R = 1 - \frac{\sum g}{\tfrac{1}{6} (N^2 - 1)} = 1 - \frac{6\,(187)}{1599}$$

* In this case 'agreement' might be a better word.
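
The Footrule example can be carried through numerically with a small Python sketch. The names `footrule` and `footrule_to_r` are illustrative only; the inputs Σg = 187 and N = 40 are the totals of the example, and the conversion uses the cosine formula quoted above.

```python
import math

def footrule(sum_gains: float, n: int) -> float:
    """Spearman's Footrule: R = 1 - 6 * sum(g) / (N^2 - 1)."""
    return 1.0 - 6.0 * sum_gains / (n * n - 1)

def footrule_to_r(big_r: float) -> float:
    """Convert the Footrule R to r using r = 2 * cos(pi/3 * (1 - R)) - 1."""
    return 2.0 * math.cos(math.pi / 3.0 * (1.0 - big_r)) - 1.0

big_r = footrule(187, 40)             # totals from the example
print(f"R = {big_r:.3f}")
print(f"r = {footrule_to_r(big_r):.3f}")
```

Only the gains are summed, so the caller is assumed to have discarded the losses in rank before passing the total.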





Example: showing the use of Spearman's Footrule.


.Vo. 

in Class 

Maths. 


!S30 

Eanh in 
Science 

g - 

1 

38 

! 50 

j 25 

3 &i 

'4 

2 

34 

i 23 

1 33 

33 

10 

3 

61 

; 50 

7 i 

264 

•9 

4 

49 


' 16 

24 

8 

5 

29 

1 62 

29 

174 

— 

6 

5 « 

72 

»5 

7 

— 

7 

64 

70 

5 

10 

5 

8 

59 

70 

' 10 

10 

— 

9 

29 

64 

29 

'4 


10 

27 

60 

32 i 

20 

— 

1 ! 

>9 

74 

37 i 

5 

— 

12 

61 

60 

Ih 

20 

>24 

*3 

43 

36 

>9 

J / u 

184 

>4 

'« 

48 

39 i 

28i 

— 


42 

46 

20 

3 ' 

I 1 

16 

46 

70 

» 7 i 

10 

— 

17 

72 

40 

2 

35 

33 

18 

62 

42 

6 

34 

aB 

'9 

33 

62 

27 

'74 

— 

20 

40 

76 

21 

4 

— 

21 

37 

64 

26 

'4 

— 

22 

39 

52 

23 

25 

a 

23 

46 

72 

* 7 i 

7 

— 

24 

7 « 

78 

3 

24 

6 

25 

25 

28 

34 

40 

26 

»9 

36 

374 

374 

— 

27 

66 

80 

4 

1 

— 

28 

73 

78 

J 

24 

'4 

29 

52 


‘4 

20 

6 

30 

28 

46 

3 ' 

3 ' 

— 

3 ' 

53 

64 

>3 

'4 

1 

32 

20 

64 

38 

'4 

— 

33 

56 

48 

12 

284 

.64 

34 

24 

38 

35 

36 

J 

35 

57 


I 1 

4 

—■ 

36 

I 1 

46 

394 

3 ' 

— 

37 

39 

56 

23 

224 


38 

60 

72 

9 

7 

— 

39 

29 

56 

29 

224 

64 

40 

27 

32 

324 

39 


Σg: Total = 187
A little consideration of the nature of regression lines will give
us a clearer idea of the meaning of correlation than will come from
an uncritical acceptance and use of the product-moment formula.
It is sometimes thought that a correlation coefficient gives an
exact measure in terms of a fraction or percentage of the agree-
ment between two scores. It is indeed true that a correlation
coefficient will give us a clue to the common elements which are
contained in the scores. As we have seen by drawing lines of
regression in scattergrams, a correlation coefficient gives us an
idea of the reduction of error in predicting scores in one test from
those in another.

It is easy by using the formula for probable error to construct a
table or draw a graph to show the percentage reduction in error
in making this forecast. This is known as the forecasting efficiency
of the correlation coefficient.

The regression equations are valuable because we can calculate
the most probable values of x₁ from x₂ and those of x₂ from x₁.
There is likely to be a considerable scatter on both sides of the
estimated values of x₁ or x₂, as can be seen by considering an actual
scatter diagram.

The Probable Error of the estimated $x_1 = .6745\,\sigma_1 \sqrt{1 - r^2}$

The Probable Error of the estimated $x_2 = .6745\,\sigma_2 \sqrt{1 - r^2}$

It can be seen that when r = 1, that is, when there is perfect
correlation, 1 − r² = 0 and thus there will be no error in find-
ing x₁ from x₂ or x₂ from x₁.

As r decreases the probable error of the estimation becomes
greater.

$\sqrt{1 - r^2}$ is called the coefficient of alienation (Kelley),
and is useful in that it gives us an idea of how high r should be for
satisfactory prediction.

When r = .1 the prediction is only .005 (that is, ½%) better than
pure chance. With r of .8 we are only 40% better than pure chance
and with r = .95 only 69% better off!
Fig. 13. The ordinates (vertical distances) give the forecasting efficiency for
various values of the correlation coefficient.


Correlation      Forecasting
coefficient      efficiency
     r               %

    .00              .0
    .10              .5
    .20             2.0
    .30             4.6
    .40             8.4
    .50            13.4
    .60            20.0
    .70            28.6
    .80            40.0
    .90            56.4
    .95            68.8
   1.00           100
It can be seen that unless r has a value of at least .8 the fore-
casting efficiency will not be above 40% (⅖ths) and therefore it
will be of little value. With a correlation coefficient of .3 the fore-
casting efficiency is less than 5% or a twentieth.
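
The forecasting efficiencies quoted here follow directly from the coefficient of alienation. A minimal Python sketch, with the illustrative function names `alienation` and `forecasting_efficiency` (not taken from the text), is:

```python
import math

def alienation(r: float) -> float:
    """Coefficient of alienation, sqrt(1 - r^2)."""
    return math.sqrt(1.0 - r * r)

def forecasting_efficiency(r: float) -> float:
    """Percentage reduction in error of prediction: 100 * (1 - sqrt(1 - r^2))."""
    return 100.0 * (1.0 - alienation(r))

for r in (0.10, 0.30, 0.50, 0.80, 0.95):
    print(f"r = {r:.2f}   forecasting efficiency = {forecasting_efficiency(r):.1f}%")
```

Running the loop over a finer grid of r values is one way to reproduce a table or graph of the kind described.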


The Correlation of Three Variables 

Sometimes we have three sets of correlation coefficients by con-
sidering three sets of variables, or attainments taken in three pairs.
It may be necessary to find the correlation between any two of the
variables supposing that the third were kept constant. Such a
case would be to find the correlation between school attainment
and estimations of character with intelligence kept constant.

The partial correlation formula is as follows:

$$r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$

$r_{12}$, $r_{13}$ and $r_{23}$ are the correlation coefficients of the scores 1, 2
and 3 taken in pairs. $r_{12.3}$ is the correlation coefficient of scores
1 and 2 with 3 kept constant.

As a further example we may consider correlation of age,
height and weight. Let us call them x years, y inches and
z lb. respectively. We can correlate them in pairs and find $r_{xy}$,
$r_{xz}$ and $r_{yz}$, but each of these correlations is affected by the third
variable.

The formula enables us to calculate the correlation between any
two, say x and y, left uninfluenced by the third.

In this case the correlation coefficient

$$r_{xy.z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}$$

For convenience of reference we give the standard error now but
this will be more fully explained in a later chapter.

$$\text{Standard error} = \frac{1 - r^2}{\sqrt{N}}$$

where r is the particular correlation coefficient which is
required.
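
A short Python sketch of the partial correlation formula may help. The three coefficients used below are hypothetical values chosen only for illustration (the text gives no figures for age, height and weight), and the function names are likewise assumptions.

```python
import math

def partial_correlation(r12: float, r13: float, r23: float) -> float:
    """Correlation of variables 1 and 2 with variable 3 kept constant."""
    return (r12 - r13 * r23) / math.sqrt((1.0 - r13**2) * (1.0 - r23**2))

def standard_error(r: float, n: int) -> float:
    """Standard error of a correlation coefficient: (1 - r^2) / sqrt(N)."""
    return (1.0 - r * r) / math.sqrt(n)

# Hypothetical coefficients: height-weight, height-age, weight-age.
r_xy, r_xz, r_yz = 0.60, 0.50, 0.70
r_partial = partial_correlation(r_xy, r_xz, r_yz)
print(f"r_xy.z = {r_partial:.3f}")
print(f"standard error (N = 100) = {standard_error(r_partial, 100):.3f}")
```

When the third variable is uncorrelated with both of the others, the function returns the original coefficient unchanged, as the formula requires.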


Tetrachoric Correlation 

Tetrachoric Correlation means a method of correlation using
four groups (as the Greek name implies). In these methods we
have data limited to the number of cases or the proportion of
cases in each of two categories in each set.

Suppose we have a number of pupils who are given tests in
science and mathematics. We can divide them into four groups.

a = Number above average in both science and mathematics.
b = Number above average in science but below average in
mathematics.

c = Number below average in science and above average in
mathematics.

d = Number below average in both science and mathematics.


                               Science
                          Above      Below
Mathematics   Above         a          c
              Below         b          d


Pearson's coefficient is

$$\rho = \cos\left(\frac{\sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\,\pi\right)$$

The value of the expression within the bracket is calculated. This
is multiplied by 180° and the cosine of the resultant 'angle' found
from the tables.

It will be seen that the total number of cases (e.g. the number of
pupils) = a + b + c + d if we can disregard pupils who are
exactly on the average line.*

Example: In an examination taken by 40 candidates 14 were
above average in both science and mathematics, 6 were above
average in science and below in mathematics, 6 were below
average in science and above in mathematics, 14 were below
average in science and in mathematics. (These are the counts
used in the calculations which follow: ad = 196 and bc = 36.)


                               Science
                          Above      Below
Mathematics   Above        14          6
              Below         6         14


By the formula

$$\rho = \cos\left(\frac{\sqrt{36}}{\sqrt{196} + \sqrt{36}} \times 180^\circ\right) = \cos\left(\frac{6 \times 180^\circ}{20}\right)$$

$$= \cos 54^\circ = .5878$$

A modification of the above is sometimes useful as it gives a
conservative (or even modest) idea of the intensity of association.
It is known as the coefficient of colligation ω, and is due to Yule.

$$\omega = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}$$

Using the same data as above:

$$\omega = \frac{\sqrt{196} - \sqrt{36}}{\sqrt{196} + \sqrt{36}} = \frac{14 - 6}{14 + 6} = \frac{8}{20} = .4$$


* When the divisions (i.e., the dichotomic lines) are at the respective means the
formula simplifies itself to

$$\rho = \sin\left(\frac{ad - bc}{N^2} \times 360^\circ\right), \quad \text{i.e.} \quad \rho = \sin\frac{2\pi\,(ad - bc)}{N^2},$$

where N = total number of measures = a + b + c + d.


The Method of Unlike Signs due to Sheppard

U = percentage of 'unlike' signs (that is, of cases with one score
above and one below average in the two tests)

= b + c (b and c being taken as percentages of the total)

L = percentage of 'like' signs (that is, the sum of cases with both
scores above or both below average respectively)

L + U = 100 (as U and L are percentages)


$$\text{Sheppard's coefficient } s = \cos\left(\frac{U}{L + U}\,\pi\right)$$

$$\therefore\ s = \cos\left(\frac{180\,U}{100}\right)^\circ = \cos\,(1.8\,U)^\circ$$


Thus, the percentage of unlike signs must be found, multiplied by
1.8 (i.e. 9/5) and the cosine of this number regarded as an angle in
degrees found from tables.

In the example used for Pearson's formula above, the percentage
of unlike signs is 12/40 × 100% = 30%

s = cos (1.8 × 30)° = cos 54°

= .5878

which is precisely the coefficient which we found above. (This
does not always happen but usually there is close agreement.)
Coefficient of Association due to Yule

From our tetrachoric table we can measure the intensity of
association between two sets of data using Q, the coefficient of
association

$$Q = \frac{ad - bc}{ad + bc}$$

Using the same data as above

$$Q = \frac{14 \times 14 - 6 \times 6}{14 \times 14 + 6 \times 6} = \frac{196 - 36}{196 + 36} = \frac{160}{232} = .69$$

This method produces a generous estimate.
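
The four coefficients computed from the fourfold table can be compared side by side in a brief Python sketch. The function names are illustrative, not from the text, and the cell counts are those appearing in the worked calculations (ad = 14 × 14 = 196, bc = 6 × 6 = 36).

```python
import math

def pearson_cos_pi(a: float, b: float, c: float, d: float) -> float:
    """Pearson's cosine formula: cos(pi * sqrt(bc) / (sqrt(ad) + sqrt(bc)))."""
    return math.cos(math.pi * math.sqrt(b * c) / (math.sqrt(a * d) + math.sqrt(b * c)))

def yule_colligation(a: float, b: float, c: float, d: float) -> float:
    """Yule's coefficient of colligation."""
    return (math.sqrt(a * d) - math.sqrt(b * c)) / (math.sqrt(a * d) + math.sqrt(b * c))

def sheppard(a: float, b: float, c: float, d: float) -> float:
    """Sheppard's method of unlike signs: cos of (1.8 * U) degrees."""
    unlike_percent = 100.0 * (b + c) / (a + b + c + d)
    return math.cos(math.radians(1.8 * unlike_percent))

def yule_q(a: float, b: float, c: float, d: float) -> float:
    """Yule's coefficient of association Q = (ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)

cells = (14, 6, 6, 14)   # a, b, c, d as used in the worked calculations
for name, fn in [("Pearson", pearson_cos_pi), ("colligation", yule_colligation),
                 ("Sheppard", sheppard), ("Q", yule_q)]:
    print(f"{name:12s} {fn(*cells):.4f}")
```

Printing the four values together makes the text's remarks concrete: the colligation is the most conservative of the set and Q the most generous.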


Biserial correlation 

Sometimes it is necessary to correlate sets of data when they are
given in the form of two mutually exclusive groups in respect of
one set and in numerical scores in respect of the other. Such
dichotomies in the first set would be given by sex differences,
married and unmarried persons, trained and untrained teachers,
graduates and non-graduates, children of a particular age group
attending school and those of the same age who have left school,
etc. The following example, taken from a study of a hundred boys
and girls, sixteen to eighteen years of age, who have left school
and another group remaining at school, will illustrate this.¹

The biserial coefficient of correlation is given by

(M,. 

¹ By EIwochI SofiM. 'A Study of one hundred boys and girls sixteen to eighteen
years of age who have left school and a similar group remaining at school' (according
to size of families). The correlation between 'Staying at School' and size of
family is only .17
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Fig 14. Non-linear correlation 


Thus both high and low perseverators would tend to have low 
character scores, and the highest character scores would be 
associated with moderate perseveration. 

In this case we use a correlation ratio η (eta) which is given by



where a, is the standard error of estimate (the standard deviation 
of one of the sets of measures) and a,, is the standard deviation 
of the other. 



CHAPTER IV 


THE PROBLEM OF ERROR 

But to us probability is the very guide to life. 

BISHOP BUTLER: Analogy of Religion

As the results which we have obtained so far assume that we
have been handling very large numbers of cases it is necessary
to consider what happens when we make our experiments
with smaller samples. It is obvious that the statistical laws which
we use will be free from errors to an extent related to the number
of cases which we can investigate. A very simple example will
suffice to show this. If we toss a penny a sufficiently huge number
of times, say 100,000, we should expect the ratio of heads to tails
to be 1 to 1 with a very tiny possible error in the 1 : 1 ratio. If
we toss the coin only 10 times it may happen that we get 3 heads
and 7 tails, but in the case of 100,000 trials the chances of getting
30,000 heads and 70,000 tails are so exceedingly remote as to have
no statistical interest for us. In other words, as the number of
trials gets larger and larger the ratio of heads to tails approaches
nearer and nearer to its true limit.
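
The penny example can be made quantitative with a small Python sketch. It assumes a fair coin and uses the binomial distribution, which the text has not introduced at this point; the function names are illustrative.

```python
import math

def prob_heads_at_most(k: int, n: int) -> float:
    """Exact chance of k or fewer heads in n tosses of a fair coin."""
    return sum(math.comb(n, i) for i in range(k + 1)) / 2**n

def sd_of_proportion(n: int) -> float:
    """Standard deviation of the proportion of heads in n fair tosses."""
    return 0.5 / math.sqrt(n)

print(f"P(3 or fewer heads in 10 tosses) = {prob_heads_at_most(3, 10):.4f}")
for n in (10, 100_000):
    sd = sd_of_proportion(n)
    # 30% heads is 0.2 away from the expected proportion of one half.
    print(f"n = {n:>7}: s.d. of proportion = {sd:.5f}, 30% heads is {0.2 / sd:.1f} s.d. away")
```

With ten tosses a result as lopsided as 3 heads lies little more than one standard deviation from expectation; with 100,000 tosses the same proportion lies more than a hundred standard deviations away, which is the sense in which it has 'no statistical interest'.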

The problem before us now is to try to find just how reliable are
the results of our investigations on various numbers of cases.
An ordinary school class may contain no more than 25 or 30
children. Again, when we have to deal with rather lengthy
investigations it is necessary to limit the number of cases con-
sidered in order that the research can be completed in a reasonable
time.

Thus, all the investigations on a metrical basis which we make
in psychology and education will have to be qualified by an
estimate of the size of the error which is likely to arise, and we
shall have to consider its size in relation to the size of other factors
concerned, as a correlation coefficient, for instance. In the
analysis of variance, that due to error may be compared with the
variance due to other causes under consideration.


It is clearly impossible, in normal procedures, to take samples so
large as to ensure that each sample is a true cross-section of
the entire population. Suppose that we have been finding the
correlation coefficients between two sets of scores in subjects A and
B and that we have been able to continue our investigations with
a large number of similar groups of children. We should not
find the correlation coefficient to be quite the same in any two
groups of children owing to errors of sampling; we should find a
central tendency in all the correlation coefficients and it would be
apparent that the correlation coefficients would satisfy the normal
law of distribution. To find the probable error we should want to
know how far from the mean or central value of the correlation
coefficient is the line which divides one-half of the coefficients
from the rest. If the dispersion were great compared with the
value of the correlation coefficient, that is, if the P.E. were more
than a small fraction of the correlation coefficient, we should
regard the latter as being unreliable.

Investigators trained in the physical sciences tend to reject any
results where the correlation coefficient is not more than four times
the probable error, but a less rigorous attitude has
prevailed in psychological investigations and results which are
no greater than three times the probable error are accepted as
being significant. Even these should be treated with great
caution and the investigation should be continued with further
critical exploration of method and data. In writing down a
correlation coefficient or other result we should therefore add the
value of the probable error.

Probable error is another term for quartile deviation or the
semi-interquartile range. Usually, however, the term quartile
deviation is only applied to simple measures and probable error is
used with derived or secondary measures, as for instance standard
deviation or the correlation coefficient. The obvious way of find-
ing the probable error would be to arrange the measures in order
or to count them and to take half; but more often the probable
error is found from the standard error (or deviation) and the use
of the formula P.E. = .6745σ (i.e. .6745 × S.D.).

It may be well to examine the meaning of the word probable.
If we say that 'It is probable that it will rain tomorrow' we
really mean that the chances that it will rain are more than
those that it will keep fine, that is to say that the chance is per-
ceptibly greater than a '50-50' chance. The expression 'probable
error' though time-honoured is misleading and really means the
distance from the central point within which half the measures
lie. (A rough approximation is that probable error = ⅔ × standard
deviation.)¹


$$\text{Probable Error of Mean} = .6745\,\frac{\sigma}{\sqrt{N}}$$

$$\text{Probable Error of Standard Deviation} = .6745\,\frac{\sigma}{\sqrt{2N}}$$

$$\text{Probable Error of Correlation Coefficient } r = .6745\,\frac{1 - r^2}{\sqrt{N}}$$


The reader must not be misled by the use of the word probable,²
and the formulae simply give the chances that the mean or other
derivatives will lie within a certain distance of the true value.


In the case of the mean the chances that it lies between + pro-
bable error and − probable error are 1 to 1. The chances that
it lies inside the limits become greater as the limits increase: for
instance


between  − P.E.   and  + P.E.    the chances are       1 to 1
         − 2 P.E. and  + 2 P.E.   "     "     "      4.5 to 1
         − 3 P.E. and  + 3 P.E.   "     "     "       21 to 1
         − 4 P.E. and  + 4 P.E.   "     "     "      142 to 1
         − 5 P.E. and  + 5 P.E.   "     "     "     1310 to 1
         − 6 P.E. and  + 6 P.E.   "     "     "   19,200 to 1


¹ These matters will become clearer when the chapter on the Normal Curve is
read. It should be remembered that the relation between standard and probable
errors only holds if normal distribution of the errors can be assumed.

² The popular treatment of probability in terms of 'odds for' and 'odds against'
should be qualified by a more systematic mathematical treatment. Here 'certainty'
is denoted by a probability of 1 and an 'impossibility' by a probability of 0. The
mathematical probability of an event lies between 0 and 1 and may be expressed
as a fraction, decimal fraction or a percentage.

If the probability that an event will happen is given by the fraction 1/x (i.e. 1
chance in x and not 1 to x) the probability against the event happening will be
1 − 1/x or the fraction (x − 1)/x.
The chances that the mean lies outside these limits are given by
interchanging the figures in the two right-hand columns.

The chances expressed in terms of standard deviations:


Limits                 Frequencies of devia-     Odds against devia-
                       tions outside these       tions falling outside
                       limits                    these limits

± P.E. = ± .6745σ      2 × 25%                   1 to 1
± σ                    2 × 15.9%                 2 to 1
± 2σ                   2 × 2.28%                 21 to 1
± 3σ                   2 × .135%                 370 to 1
± 4σ                   2 × .0032%                15,600 to 1


The standard error (or standard deviation) does not tell us how
much our result is in error but rather the chances that the result
has an error of a particular magnitude.
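
Odds of this kind can be recomputed from the normal probability integral. The Python sketch below assumes normally distributed errors and uses the standard library function `math.erf`; the function names are illustrative, and the computed odds may differ a little from the rounded figures quoted in the text.

```python
import math

PE_IN_SIGMAS = 0.6745  # one probable error expressed in standard deviations

def prob_within(k_sigma: float) -> float:
    """Chance that a normal deviate lies within +/- k_sigma standard deviations."""
    return math.erf(k_sigma / math.sqrt(2.0))

def odds_within(k_sigma: float) -> float:
    """Odds (to 1) in favour of lying inside the limits."""
    p = prob_within(k_sigma)
    return p / (1.0 - p)

for k in range(1, 7):
    print(f"+/- {k} P.E.:  about {odds_within(k * PE_IN_SIGMAS):,.1f} to 1")

for k in range(1, 5):
    outside = 100.0 * (1.0 - prob_within(k))
    print(f"+/- {k} sigma: {outside:.4f}% outside, about {odds_within(k):,.0f} to 1")
```

The first loop works in probable errors and the second in standard deviations, so the two sets of odds can be read against each other.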


Summary of the Probable Errors of Correlation Coefficients

r is the correlation coefficient found by the product-moment or
line of regression method

$$\text{P.E.} = .6745\,\frac{1 - r^2}{\sqrt{N}}$$

ρ (rho) is the correlation coefficient found by the rank method

$$\text{P.E.} = .706\,\frac{1 - \rho^2}{\sqrt{N}}$$

R is the correlation coefficient found by Spearman's footrule or
the 'gains in rank' method

$$\text{P.E.} = \frac{.43}{\sqrt{N}}$$


It will be noticed that in each case the denominator contains
√N, the square root of the number of cases considered. The
consequence of this is that if we quadruple the number of cases
(e.g. consider 120 pupils instead of 30) the probable error is
reduced by a half, and it will be reduced to a third if the number
of cases is multiplied by nine.
e.g. (a) find the P.E. where r = .9, N = 36.

$$\text{P.E.} = \frac{.6745\,(1 - .9^2)}{\sqrt{36}} = \frac{.6745 \times .19}{6} = .0213$$

r = .9 ± .0213

In writing the probable error in this way it must be remembered
that the P.E. is given as a probability and not as an actuality.

(b) find the P.E. where r = .4, N = 16.

$$\text{P.E.} = \frac{.6745\,(1 - .4^2)}{\sqrt{16}} = \frac{.6745 \times .84}{4} = .142$$

r = .4 ± .142

Here the P.E. is more than a third of the correlation coefficient.
The latter cannot therefore be considered reliable or even
significant.* It would have been better in the investigation to have
used all possible means to take a greater number of cases than 16.
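
Both examples, and the rule of comparing a coefficient with three times its probable error, can be put into a few lines of Python. The function names and the default threshold argument are illustrative assumptions, not taken from the text.

```python
import math

def probable_error_of_r(r: float, n: int) -> float:
    """P.E. of a product-moment coefficient: .6745 * (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

def exceeds_multiple_of_pe(r: float, n: int, multiple: float = 3.0) -> bool:
    """True when |r| is greater than `multiple` times its probable error."""
    return abs(r) > multiple * probable_error_of_r(r, n)

for r, n in ((0.9, 36), (0.4, 16)):
    pe = probable_error_of_r(r, n)
    print(f"r = {r}, N = {n}: P.E. = {pe:.4f}, r / P.E. = {r / pe:.1f}, "
          f"more than 3 x P.E.: {exceeds_multiple_of_pe(r, n)}")
```

Passing `multiple=4.0` applies the stricter standard attributed to investigators trained in the physical sciences.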

* The nature of the ratio between a coefficient and its standard error or deviation
must be carefully considered. The figure which is taken really means that the
chances that the coefficient has no significance are reduced to such an extent that
we have reason to believe that there is good evidence of significance. There is no
case of conclusive proof. As a figure equal to twice the standard deviation only
occurs about once in 22 cases Fisher suggests that this may be regarded as signifi-
cant. As probable error is about ⅔ × standard error or deviation, Fisher's sugges-
tion is that 3 × probable error would be a significant quantity.

McCall has suggested a ratio of 2.78 × standard deviation (i.e. about 4.17 pro-
bable error), but this is larger than we usually find in psychological and educational
experiments even when other considerations lead us to believe that there should be
significance and some notable degree of correlation between our figures.

Peters suggests that a figure somewhat less than that of Fisher's may be per-
mitted. He takes the point on the probability curve where it bends to a maximum
degree as the distribution thins out to a long tail. This gives a value of 1.73 × S.E.
or 2.6 × P.E. and for this he proposes the term working ratio.

In each case the student should fortify himself by finding what is the extent of the
probability from the tables of the integral of the normal probability curve and it
should be kept in mind that probability does not imply certainty.
P.E. of tetrachoric r where the dichotomic lines are at the means

$$\text{P.E.} = \frac{.6745\,(2\pi \sqrt{1 - r^2})}{\sqrt{N}} \left[\frac{(a + d)(c + b)}{4N^2}\right]^{1/2}$$

and where true r = 0

$$\text{P.E.} = \frac{.6745\,\pi}{2\sqrt{N}}$$

The probable error does not give a very good estimate of the
reliability of r when N is small and r is large. Accordingly Fisher
has suggested that r should be replaced by its hyperbolic arc-
tangent $\tanh^{-1} r$, which he calls z and for which he provides
tables.

$$\tanh^{-1} r = z = \tfrac{1}{2} \left[\log_e (1 + r) - \log_e (1 - r)\right]$$

$$= 1.1513 \left[\log_{10} (1 + r) - \log_{10} (1 - r)\right]$$

Many experimenters would feel that results obtained by investiga-
tions with less than 25 cases would be so unreliable as to be of
negligible worth and where any rigorous research was undertaken
a hundred cases or more should be considered.
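
Where Fisher's tables are not to hand, z can be computed directly. The Python sketch below shows the natural-logarithm form and the common-logarithm form side by side; the function names are illustrative, and the constant 1.1513 is half the natural logarithm of 10.

```python
import math

def fisher_z(r: float) -> float:
    """z = tanh^-1(r) = 1/2 * [ln(1 + r) - ln(1 - r)]."""
    return 0.5 * (math.log(1.0 + r) - math.log(1.0 - r))

def fisher_z_log10(r: float) -> float:
    """The same quantity from common logarithms: 1.1513 * [log10(1 + r) - log10(1 - r)]."""
    return 1.1513 * (math.log10(1.0 + r) - math.log10(1.0 - r))

for r in (0.1, 0.5, 0.9):
    print(f"r = {r}: z = {fisher_z(r):.4f} (natural logs), "
          f"{fisher_z_log10(r):.4f} (common logs)")
```

For small r the value of z is nearly equal to r; the two part company as r approaches 1, which is where the probable error is least trustworthy.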


Test Reliability and Test Length 

If, after a sufficient interval, a test is applied again under
similar circumstances there should be a high degree of correlation
between the two sets of scores. Moreover, if the test is a good one
it should be largely independent of the qualities and skills of those
administering it.

If a test is reliable it can only be so if it is thorough and this will
depend to a large extent on its length. If tests are supplied in
double form so that there are two parallel tests, a re-test with the
second set should produce results with a high degree of correlation,
that is, upwards of .9, with the first set. When two similar tests
are not supplied, a single test is converted into two by taking the
odd-numbered questions as a shortened first test and the even-
numbered as the second test. By shortening the test its reliability
is also reduced and therefore it is necessary to have some means of
predicting the reliability of a test if it were lengthened.

Suppose r is the correlation coefficient of the results of the two
halved tests. Then if R is the correlation coefficient between the
complete given test and an imaginary one of similar type

$$R = \frac{2r}{1 + r}$$

In a general case, where a test is imagined to be lengthened n
times we may use the Spearman-Brown prophecy formula:

$$R = \frac{nr}{1 + (n - 1)\,r}$$

(of which the formula for the doubled tests is the simplest case).
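
The prophecy formula is easily tried out in Python. The half-test correlation of .80 used below is a hypothetical figure chosen for illustration, and the function name is an assumption.

```python
def spearman_brown(r: float, n: float = 2.0) -> float:
    """Predicted reliability of a test lengthened n times: n*r / (1 + (n - 1)*r)."""
    return n * r / (1.0 + (n - 1.0) * r)

half_test_r = 0.80   # hypothetical correlation between odd and even halves
print(f"full-length test: R = {spearman_brown(half_test_r):.3f}")
print(f"trebled test:     R = {spearman_brown(half_test_r, 3):.3f}")
```

With n = 2 the function reduces to the doubled-test formula 2r / (1 + r), and with n = 1 it returns r unchanged.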

We can calculate the reliability or the limits of variation of
individual scores when we know the reliability coefficient.

$$\text{Probable error} = .6745\,\sigma \sqrt{1 - r^2}$$

e.g. if there is a correlation of .95 between intelligence tests and
the standard deviation of the intelligence quotients is 15 then

$$\text{P.E. of I.Q.} = .6745 \times 15 \times \sqrt{1 - .95^2} \approx 3.2$$

This means that about half the people taking the second test will
have I.Q.s which differ from those which they obtained in the
first test by no more than a little over 3 points. By considering the
way in which the expression $\sqrt{1 - r^2}$ becomes larger as r becomes
smaller the student will see how rapidly the probable error increases
as the reliability coefficient r drops below .9. Unfortunately an r of
.95 is exceedingly rare. It should be added that the reliability of
a test will appear to be lower than it can be taken to be, if it is
given to groups which are too homogeneous and therefore do not
permit proper sampling both in respect of age and abilities. The
difference in reliability as given by tests with two groups of
different 'spread' (i.e. homogeneity or heterogeneity) is given by
the formula



where R is the reliability to be expected with a group with
standard deviation of I.Q. σ₁ and r the reliability with a group
of S.D. σ₂.



CHAPTER V 


THE NORMAL CURVE OF DISTRIBUTION AND ITS USES

Most students are familiar with the well-known bell-shaped
curve and we have already noticed it when we were
considering the distribution of measures with respect to
a central tendency. It is now convenient to consider more care-
fully the nature of this important curve. For the reader who can
deal with simple calculus some of its mathematical properties have
been worked out in Appendix III. For the purposes of the present
section it will suffice if we examine the shape of the curve and
know the meaning of the heights of various lines drawn vertically
in it and the significance of areas bounded by the curve and cut
off by such lines. The quantitative aspects of such lines and areas
will be given in simple tables. The curve is sometimes called the
Laplacian or Gaussian curve in honour of Laplace and Gauss who
respectively used it in their work on probability. For reasons
which will be apparent it is also called the probability curve or
curve of error. One of its most fruitful early uses was to deal with
experimental errors in astronomical observations.

A word of warning must be uttered concerning the use of the
so-called 'normal' curve. Too often in the past the adjective
'normal' has been misused. The distribution of the velocities of
molecules of a gas, or that of the quantitative measures of errors
in respect of many physical observations may, under certain conditions
where there are no biasing factors, conform to such a curve. Even
here the mathematical theory of pure chance in the distribution
usually preceded any attempt to check its validity, which has to
be assumed without experiment in many cases. In the case of
'mental measurements' the matter is much more difficult. We
have no theoretical basis for expecting such distributions, and in
fact factors can be imagined which may cause skewing. In an
intelligence test scale we are not dealing with the physicist's
'class A' measures such as length, speed and mass. We can obtain
a length of 130 cm. by adding one of 70 cm. to another of 60 cm.
We cannot obtain an I.Q. of 130 by adding one of 70 to another
of 60. Each I.Q. must be referred separately to an arbitrary
scale. It would be foolish to assume that there is a fundamental
'law of normality' which applies to most sets of educational and
psychological data. Most of the groups and samples with which
we have to deal in psychological research are only defined in a
vague and ambiguous manner and the degree of homogeneity in
traits other than the one which we are considering is seldom
sufficient to eliminate their effect.

It is impossible to talk about the form of a distribution being
normal with any meaning unless we specify the type and classifica-
tion of the individuals concerned.

Certain physical characteristics such as weight show reasonably
good normal distribution for individuals of the same sex, race,
age and height, but even here the curve is negatively skewed, as in
'normal' times excessive overweight is more common than
excessive underweight. The use of the word 'normal', whether it
describes the times in which we live, a person's behaviour, or a
distribution, needs careful consideration. This is not to despise
its use in educational research, but the early use of the distribution
to deal with errors and deviations from a mean is still the most
useful. A curious example of 'circular reasoning' sometimes takes
place with respect to intelligence tests. Such tests are usually
devised to give a 'normal' distribution of the scores with certain
population classes. It is to be expected therefore that when they
are applied to the testing of similar population classes the distribu-
tion should be normal.* The symmetrical bell-shaped curve is
useful because it is susceptible to easy mathematical treatment,
but here again we must not be ensnared by the attempts which
mental testers have made to give numerical assessments of
intelligence along a scale of numbers. This scale has none of the
properties of a graduated rule or length. The boy with I.Q. 130
is not twice as intelligent as a boy of I.Q. 65. There is in fact
hardly any means of comparing these individuals; the first able
to benefit by Grammar School teaching and the other practically
a moron. The 'man in the street' who said that the first was
'a thousand times as intelligent' as the second would, in spite of
exaggeration, have the germ of truth in him.

* A distribution which does not conform to the 'normal curve' may be quite
normal in the usual sense. In educational measurements and calculations the words
'normal distribution' refer to the curve.






The mean, mode and median of the curve arc equal and arc 
marked y, on the central axis of 7 , about \\ Inch line the cur\'c is 
symmetrical. The area of the curs e represents Uic total number of 
scores or measures which are distributed. By drawing vertical 
lines we can measure the areas enclosed by the curve which are 
cut off by tlicm. These represent the numbers of scores w hich are 
beyond or within a certain value of the score. 

If there is good dispersion of the scores the curve is wide and 
well-rounded, but if, on the other hand, there is not much 
dispersion and the scores deviate but little from the mean, the 
curve is thin, .shai p and pointed. 

It will be ob.scrved that at points on tlic curve, known as points 
of inJUxion, the convex shape of tlic top part of the curve gives way 
to the concavity of the lower part of each side. 77iese points are 
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at a distance a (standard deviation) oji each side of the central 
point. 

The curve is said to be asymptotic to the axis of x (that is the 
horizontal b.ise linej. This means that the curve approaches this 
line if it is suflicicntly c.\tcnded at both sides It is said to meet the 
line ‘at inhmtv’. Tlie standaid deviation a is a convenient unit 
for measuring distances along the r axis. Exceedingly little of the 
aica ol the cutvc remains at dist.inccs greater than 3a on each 
side of the central line. 

It is convenient to reduce all distances along the x axis to 
\tgma-umls by dividing the x distances b> cr. 



The amount of the area enclosed by the whole curve lying between
verticals at distances of σ on each side of the central line is
68.26%.

That enclosed between verticals at distances of 2σ on each side of
the central line is 95.44%,

and that enclosed between verticals at distances of 3σ on each
side of the central line is 99.73%.
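
The three percentages just quoted can be checked without a table. The following is a minimal sketch, assuming only the standard error function from Python's `math` module; the function name `area_within` is a hypothetical one chosen for illustration. Four-figure tables give 68.26 and 95.44 because they double the rounded half-areas 34.13 and 47.72; direct calculation gives values a unit higher in the last place.

```python
import math

def area_within(k):
    """Per cent of the normal curve between -k sigma and +k sigma."""
    # For a normal curve, P(|z| < k) = erf(k / sqrt(2)).
    return 100 * math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma of the mean: {area_within(k):.2f}%")
```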

The following table gives the proportion (percentage) of tlie 
total area under the normal curve between the central line (mean 
ordinate) and an ordinate (vertical line) at any given distance (in 
sigmas) from the mean. 






STATISTICS IN SCHOOL 
Table I 


PER CENT OF TOTAL AREA UNDER THE NORMAL CURVE
BETWEEN MEAN ORDINATE AND ORDINATE AT ANY 
GIVEN SIGMA-DISTANCE FROM THE MEAN 
x/σ     Per cent of area
3.0     49.87
3.5     49.98
4.0     49.997
5.0     49.99997


The next table gives the ordinates (the vertical heights) under
the normal curve at various x distances (in terms of standard
deviation) from the mean. The ordinates are given as proportions
of the mean ordinate, that is, the greatest height of the curve.
Such a table is useful if we desire to find the frequency at a certain
point, e.g. the number of cases with a certain score.
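
Any entry of such a table can be recomputed directly, since the ordinate at a distance of x/σ from the mean, taken as a proportion of the mean ordinate, is exp(−(x/σ)²/2). The sketch below assumes nothing beyond this property of the normal curve; `relative_ordinate` is a hypothetical name used for illustration only.

```python
import math

def relative_ordinate(z):
    """Ordinate at z sigma-units from the mean, as a proportion of the mean ordinate."""
    return math.exp(-z * z / 2)

# A few spot values, rounded to four places as in a printed table.
for z in (0.5, 1.0, 2.0, 3.0):
    print(f"x/sigma = {z:.1f}: {relative_ordinate(z):.4f}")
```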
Table II

ORDINATES UNDER THE NORMAL CURVE AT VARIOUS SIGMA- 
DISTANCES FROM THE MEAN (ORDINATES EXPRESSED AS 
PROPORTIONS OF THE MEAN ORDINATE) 


x/σ     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0  1.0000  1.0000   .9998   .9996   .9992   .9988   .9982   .9976   .9968   .9960
0.1   .9950   .9940   .9928   .9916   .9903   .9888   .9873   .9857   .9839   .9821
0.2   .9802   .9782   .9761   .9739   .9716   .9692   .9668   .9642   .9616   .9588
0.3   .9560   .9531   .9501   .9470   .9438   .9406   .9373   .9338   .9303   .9268
0.4   .9231   .9194   .9156   .9117   .9077   .9037   .8996   .8954   .8912   .8869
0.5   .8825   .8781   .8735   .8690   .8643   .8596   .8549   .8501   .8452   .8403
0.6   .8353   .8302   .8251   .8200   .8148   .8096   .8043   .7990   .7936   .7882
0.7   .7827   .7772   .7717   .7661   .7605   .7548   .7492   .7435   .7377   .7319
0.8   .7262   .7203   .7145   .7086   .7027   .6968   .6909   .6849   .6790   .6730
0.9   .6670   .6610   .6550   .6489   .6429   .6368   .6308   .6247   .6187   .6126
1.0   .6065   .6005   .5944   .5883   .5823   .5762   .5702   .5641   .5581   .5521
1.1   .5461   .5401   .5341   .5281   .5222   .5162   .5103   .5044   .4985   .4926
1.2   .4868   .4809   .4751   .4693   .4636   .4579   .4521   .4464   .4408   .4352
1.3   .4296   .4240   .4185   .4129   .4075   .4020   .3966   .3912   .3859   .3806
1.4   .3753   .3701   .3649   .3597   .3546   .3495   .3445   .3394   .3345   .3295
1.5   .3247   .3198   .3150   .3102   .3055   .3008   .2962   .2916   .2870   .2825
1.6   .2780   .2736   .2692   .2649   .2606   .2563   .2521   .2480   .2439   .2398
1.7   .2358   .2318   .2278   .2239   .2201   .2163   .2125   .2088   .2051   .2015
1.8   .1979   .1944   .1909   .1874   .1840   .1806   .1773   .1740   .1708   .1676
1.9   .1645   .1614   .1583   .1553   .1523   .1494   .1465   .1436   .1408   .1381
2.0   .1353   .1327   .1300   .1274   .1248   .1223   .1198   .1174   .1150   .1126
2.1   .1103   .1080   .1057   .1035   .1013   .0991   .0970   .0950   .0929   .0909
2.2   .0889   .0870   .0851   .0832   .0814   .0796   .0778   .0760   .0743   .0727
2.3   .0710   .0694   .0678   .0662   .0647   .0632   .0617   .0603   .0589   .0575
2.4   .0561   .0548   .0535   .0522   .0510   .0497   .0485   .0473   .0462   .0451
2.5   .0439   .0429   .0418   .0407   .0397   .0387   .0378   .0368   .0358   .0349
2.6   .0341   .0332   .0324   .0315   .0307   .0299   .0291   .0283   .0276   .0268
2.7   .0261   .0254   .0247   .0241   .0234   .0228   .0222   .0216   .0210   .0204
2.8   .0198   .0193   .0188   .0182   .0177   .0172   .0167   .0163   .0158   .0154
2.9   .0149   .0145   .0141   .0137   .0133   .0129   .0125   .0122   .0118   .0115
3.0   .0111











The area table will prove to be the more useful, however, and
here are some of the uses to which it may be put.

1. It is consulted if we wish to find the number or proportion
of cases in a normal distribution which lie on one side of a point
along the scale.

Example: A set of I.Q. scores has a mean of 100 and S.D. of 15.
Find the percentage of scores which lie above 120.

This score of 120 is 20 above the mean, or in terms of
sigma-scores 20/15 or 1.333 above the mean.

From the table we see that an x/σ value of 1.33 gives a percentage
of 40.82 for the area between the mean ordinate and the given
one. (By interpolation we get the value of 40.88 for 1.333.)

As the curve is symmetrical about the mean ordinate 50% of
its area lies above (to the right of) this line.

Thus the percentage of scores which lie above 120 is
(50 − 40.88) = 9.12%.

To convert this to an actual number we should multiply the
total number of cases by 9.12/100.
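
The worked example can be repeated mechanically. This is a minimal sketch, assuming Python's standard `statistics.NormalDist` in place of the printed area table; the function names `percent_above` and `percent_between` are hypothetical and chosen for illustration.

```python
from statistics import NormalDist

def percent_above(score, mean, sd):
    """Per cent of a normal distribution lying above the given score."""
    return 100 * (1 - NormalDist(mean, sd).cdf(score))

def percent_between(low, high, mean, sd):
    """Per cent of a normal distribution lying between two scores."""
    dist = NormalDist(mean, sd)
    return 100 * (dist.cdf(high) - dist.cdf(low))

# Mean 100, S.D. 15: proportion of scores above 120.
print(f"{percent_above(120, 100, 15):.2f}% of scores lie above 120")
```

The second function covers the case of two points on the scale: each point is converted to an area and one area is subtracted from the other.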

2. It is easy to extend the above to find the percentage or
number of cases which lie between two points on the scale. The
process outlined in (1) is repeated in respect of both points and
a simple subtraction gives the required result.

3. The table may also be used to find the point on the scale
above or below which a given number or percentage of the cases
in a normal distribution lie. This is the reverse of (1).

Suppose 15% of the cases lie above the required point. Then,
considering only one side of the curve, (50 − 15)% or 35% of the
cases will lie between it and the central line. We therefore search
in the body of the table to find an x/σ value corresponding to this.

The value is therefore 1.036 (by interpolation) and if σ = 15
the required point is 1.036 × 15 along the x axis.

If the mean is given by 100 this point will be 100 + 1.036 × 15
= 115.5.

This type of calculation may be extended to find the x distance
on each side of the mean which cuts off a certain middle propor-
tion of the cases. We can halve this proportion and
work on one side of the mean only, thus taking advantage of the
symmetrical properties of the curve.
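
The reverse use of the table can be sketched in the same way. The code below assumes the `inv_cdf` method of Python's `statistics.NormalDist` as a stand-in for searching the body of the table; `point_with_percent_above` and `middle_limits` are hypothetical helper names.

```python
from statistics import NormalDist

def point_with_percent_above(percent, mean, sd):
    """Score above which the given percentage of cases lie."""
    return NormalDist(mean, sd).inv_cdf(1 - percent / 100)

def middle_limits(percent, mean, sd):
    """Scores on each side of the mean enclosing the middle percentage of cases."""
    half = percent / 200  # half the middle proportion, on one side of the mean
    upper = NormalDist(mean, sd).inv_cdf(0.5 + half)
    return 2 * mean - upper, upper  # symmetrical about the mean

print(round(point_with_percent_above(15, 100, 15), 1))
```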

4. The curve may also be used for finding certain probable
values and for obtaining an understanding of what is meant by
probable error. There are various arithmetical ways of expressing
a probability. If we say that ‘it will probably rain tomorrow’ we
mean that the chances of rain are greater than those that it will
keep fine, that is, slightly more than the 1 : 1 or even chance.
The probability is rather more than ½ or 50%. In the case of the
‘normal curve’, probabilities are measured as ratios or percentages
of a particular area compared with that of the whole. If the ratio
or percentage is a small one the probability is correspondingly
small. For example, a probability of 2½% would be 1 chance in
40; a probability of 98% would be 49 chances in 50. Statistics is
full of probabilities and the student should try to think in these
terms. Probabilities are not certainties but refer to what is likely
to happen in the long run and with a sufficiently large number
of cases. Even though the chances that an event will happen or
that a result is significant may be very much greater than the
chances that the event will not happen or that the result is not
significant, there is still an uncertainty. Many of the so-called
‘laws of science’ are to be thought of as being true to the extent
of a large probability based on the results of a great number of
observations. Probabilities of a sequence of chance happenings
are subject to the rules of the behaviour of a single happening
and no further prediction can be made. For instance, if we toss
a penny four times and four successive ‘heads’ result, the proba-
bility that we shall throw a ‘tail’ on the fifth toss is no greater nor
less than it was at the start. It is still an ‘even chance’, i.e. a
probability of ½ or 50%.
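
The point about the penny can be illustrated by counting cases. The sketch below assumes a fair penny and independent tosses, so that all 32 sequences of five tosses are equally likely; it then looks only at those sequences which begin with four heads.

```python
from itertools import product

# Every equally likely sequence of five tosses of a fair penny.
all_sequences = list(product("HT", repeat=5))

# Keep only the sequences that open with four successive heads.
after_four_heads = [s for s in all_sequences if s[:4] == ("H", "H", "H", "H")]

# Among these, the proportion ending in a tail on the fifth toss.
tails_on_fifth = sum(1 for s in after_four_heads if s[4] == "T")
probability = tails_on_fifth / len(after_four_heads)
print(probability)
```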

Suppose that the curve represents ‘errors’ or deviations from
the mean. If we divide the area of the curve into halves by taking
the ‘middle’ half of the scores we shall have 25% of the measures
on each side of the mean line. The chances are even that any
measure selected at random will lie within the ‘middle’ half of
the scores.

We can find the distance of the x value which marks the
boundary of the 25% of area by consulting the table. A rough
value of x/σ is .67, but by interpolation or by consulting a book of
statistical tables we can obtain a more accurate value. We find
that the chances are even (the probability is ½) that any measure,
score or error selected at random from a normal distribution will
deviate from the mean by more (or less) than .6745σ.
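
The multiplier .6745 is simply the x/σ value which leaves 25% of the area between itself and the mean. A minimal sketch, assuming Python's `statistics.NormalDist` for the unit normal curve (the names below are illustrative, not taken from the text):

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1

# x/sigma value with 25% of the area between it and the mean,
# i.e. with 75% of the whole area lying below it.
probable_deviation = unit_normal.inv_cdf(0.75)

# Per cent of cases lying within one probable deviation of the mean.
middle_half = 100 * (unit_normal.cdf(probable_deviation)
                     - unit_normal.cdf(-probable_deviation))

print(round(probable_deviation, 4), round(middle_half, 1))
```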

Table III 

PER CENT OF TOTAL AREA UNDER THE NORMAL CURVE
BETWEEN MEAN ORDINATE AND ORDINATE AT ANY
GIVEN P.E. DISTANCE FROM THE MEAN¹


x/P.E.    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0       .00    .27    .54    .81   1.08   1.35   1.61   1.88   2.15   2.42
0.1      2.69   2.96   3.23   3.49   3.76   4.03   4.30   4.56   4.83   5.10
0.2      5.37   5.63   5.90   6.16   6.43   6.70   6.96   7.23   7.49   7.75
0.3      8.02   8.28   8.54   8.81   9.07   9.33   9.59   9.85  10.11  10.37
0.4     10.63  10.89  11.15  11.41  11.67  11.93  12.18  12.44  12.69  12.95
0.5     13.20  13.46  13.71  13.96  14.22  14.47  14.72  14.97  15.22  15.47
0.6     15.71  15.96  16.21  16.46  16.70  16.95  17.19  17.43  17.68  17.92
0.7     18.16  18.40  18.64  18.88  19.12  19.35  19.59  19.82  20.06  20.29
0.8     20.53  20.76  20.99  21.22  21.45  21.68  21.91  22.13  22.36  22.58
0.9     22.81  23.03  23.25  23.48  23.70  23.92  24.14  24.35  24.57  24.79
1.0     25.00  25.21  25.43  25.64  25.85  26.06  26.27  26.48  26.68  26.89
1.1     27.09  27.30  27.50  27.70  27.90  28.10  28.30  28.50  28.70  28.89
1.2     29.09  29.28  29.47  29.66  29.85  30.04  30.23  30.42  30.60  30.79
1.3     30.97  31.15  31.34  31.52  31.70  31.87  32.05  32.23  32.40  32.58
1.4     32.75  32.92  33.09  33.26  33.43  33.60  33.76  33.93  34.09  34.25
1.5     34.42  34.58  34.74  34.90  35.05  35.21  35.36  35.52  35.67  35.82
1.6     35.97  36.12  36.27  36.42  36.57  36.71  36.86  37.00  37.14  37.28
1.7     37.42  37.56  37.70  37.84  37.97  38.11  38.24  38.37  38.50  38.63
1.8     38.76  38.89  39.02  39.15  39.27  39.39  39.52  39.64  39.76  39.88
1.9     40.00  40.12  40.23  40.35  40.46  40.58  40.69  40.80  40.91  41.02
2.0     41.13  41.24  41.35  41.45  41.56  41.66  41.77  41.87  41.97  42.07
2.1     42.17  42.27  42.36  42.46  42.55  42.65  42.74  42.84  42.93  43.02
2.2     43.11  43.20  43.29  43.37  43.46  43.54  43.63  43.71  43.80  43.88
2.3     43.96  44.04  44.12  44.20  44.28  44.35  44.43  44.50  44.58  44.65
2.4     44.73  44.80  44.87  44.94  45.01  45.08  45.15  45.21  45.28  45.35
2.5     45.41  45.48  45.54  45.60  45.67  45.73  45.79  45.85  45.91  45.97
2.6     46.03  46.08  46.14  46.20  46.25  46.31  46.36  46.41  46.47  46.52
2.7     46.57  46.62  46.67  46.72  46.77  46.82  46.87  46.91  46.96  47.01
2.8     47.05  47.10  47.14  47.19  47.23  47.27  47.31  47.36  47.40  47.44
2.9     47.48  47.52  47.56  47.59  47.63  47.67  47.71  47.74  47.78  47.81
3.0     47.85  47.88  47.92  47.95  47.98  48.02  48.05  48.08  48.11  48.14
3.1     48.17  48.20  48.23  48.26  48.29  48.32  48.35  48.37  48.40  48.43
3.2     48.46  48.48  48.51  48.53  48.56  48.58  48.61  48.63  48.65  48.68
3.3     48.70  48.72  48.74  48.76  48.79  48.81  48.83  48.85  48.87  48.89
3.4     48.91  48.93  48.95  48.97  48.98  49.00  49.02  49.04  49.05  49.07
3.5     49.09  49.10  49.12  49.14  49.15  49.17  49.18  49.20  49.21  49.23
3.6     49.24  49.26  49.27  49.28  49.30  49.31  49.32  49.33  49.35  49.36
3.7     49.37  49.38  49.40  49.41  49.42  49.43  49.44  49.45  49.46  49.47
3.8     49.48  49.49  49.50  49.51  49.52  49.53  49.54  49.55  49.56  49.57
3.9     49.57  49.58  49.59  49.60  49.61  49.61  49.62  49.63  49.64  49.64
4.0     49.65  49.66  49.67  49.67  49.68  49.68  49.69  49.70  49.70  49.71

4.5     49.88
5.0     49.963
5.5     49.9896
6.0     49.9974
7.0     49.9999
8.0     49.999997


' b dM5Jac4 5lM>f JiultdirMcd l>7 ;r«t«bl« mar. 
.6745σ is called the probable deviation, and a probable error is
.6745 × standard error.

A third table gives the areas of the normal curve under certain
values of x expressed in terms of probable deviation instead of
standard deviation (sigma, σ) values. As is to be expected, 25%
of the area on either side of the central line gives an x/P.E. value
of 1.


Fitting a Normal Curve to a Series of Measures given in the form of a
Frequency Polygon

It is better to draw the histogram or frequency polygon on 
graph paper to a suitable scale so that the paper is comfortably 
filled. The S.D. of the measures should be calculated after they 
have been grouped into frequencies. 

(1) The height of the normal curve (see Appendix III) may be
calculated from

$$y_0 = \frac{N}{\sigma\sqrt{2\pi}}$$

when N is the number of measures and σ is the standard deviation.

(2) The mid-point of each interval should be calculated
in terms of sigma units by dividing each x value by the standard
deviation.

(3) By using Table II the heights of the ordinates at each of
these points are calculated. The table gives these values as a pro-
portion of the mean ordinate and the actual heights are found by
multiplying the height of the normal curve (mean ordinate) by
the figure found in the table. The curve may then be plotted
by joining the tops of the vertical ordinates with a smooth curve.

Inevitably there will be discrepancies between the actual
ordinates and those obtained from the perfect curve. The sum
of the theoretical frequencies of the curve should always be slightly
less than those of the given distribution. The probability that a
given distribution has discrepancies (which make it differ from
a theoretical distribution) which are not due to chance can be
found by using Chi-square and consulting the appropriate tables.
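
The three steps can be sketched as a short calculation. This is an illustration only: the grouped frequencies are invented, the function name `fit_normal_curve` is hypothetical, and it is assumed that the mean ordinate formula takes σ in units of the class interval, which is the same as multiplying N by the interval width when σ is left in score units.

```python
import math

def fit_normal_curve(midpoints, frequencies, interval):
    """Return (mean, sd, mean ordinate, fitted frequencies) for grouped measures."""
    n = sum(frequencies)
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
    sd = math.sqrt(sum(f * (m - mean) ** 2
                       for m, f in zip(midpoints, frequencies)) / n)
    # Step (1): height of the curve at the mean, with sd in interval units.
    y0 = n / ((sd / interval) * math.sqrt(2 * math.pi))
    # Steps (2) and (3): sigma-distance of each mid-point, then its ordinate
    # as a proportion of the mean ordinate (the quantity tabulated in Table II).
    fitted = [y0 * math.exp(-((m - mean) / sd) ** 2 / 2) for m in midpoints]
    return mean, sd, y0, fitted

# Hypothetical distribution: five intervals of width 10.
mean, sd, y0, fitted = fit_normal_curve([45, 55, 65, 75, 85], [2, 8, 12, 8, 2], 10)
print(round(mean, 1), round(sd, 1), round(y0, 2), [round(v, 1) for v in fitted])
```

With these invented figures the fitted frequencies sum to a little under the 32 cases observed, in keeping with the remark above.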

The curve has some other uses in educational statistics. It can
be used for setting standards for the distribution of marks, to
assign values of difficulty to questions in a test, to give numbers
of pupils in equal ability or talent ranges, for making scales for
measuring various factors in addition to those of a purely cognitive
type. It is often convenient to consider the curve as extending
from −3σ to +3σ or even from −2.5σ to +2.5σ only. The
student will bear in mind the nature of the small errors so
introduced.



Fig 16 



CHAPTER VI 


MARKING AND ITS PROBLEMS 

It is both amusing and disturbing to think that in many schools
and colleges, lists of marks which have been produced in an
arbitrary and entirely unscientific manner are thought to have
an absolute value which bears no relation to the means by which
they are obtained. For weal or woe no small part of the work of
many teachers is the production of mark lists and the compound-
ing of marks. It is well to give a little thought to the foundations
of our beliefs concerning marks, particularly when these have
been regarded as sacrosanct and as a type of numerical label by
which one individual differs from another. A moment's thought
will serve to show the limitations of certain marking systems. It
would be a bold man who in marking two essays would give
thirteen marks out of twenty to one and fourteen to another and
be certain that the second was 5% better than the first! It would
be a still bolder man who insisted that he was sure, in an English
examination of the old type, that a candidate with 96 marks out
of 100 was 1% better than another with 95 marks.

We can begin by summarizing the chief uses of marking
systems:


1. To obtain an order of merit list

This is the popular use of marks in the schools. In order that
there shall be a good spread it is necessary to devise a test which
will give a normal distribution of the marks, or something
approaching it. If two pupils have the same mark they will
occupy the same place and the next pupil in order of merit will
have the next but one place. If the mark list in order of merit is
to be used for correlation purposes either by Spearman’s method
of ranks or by the ‘footrule’ it is wise to consider more carefully
these ‘tied’ places.

e.g. The following is a portion of an old school mark list.

            Mark    Position in rank
Thompson     92          1
Allen        84          2
Walker       81          3 =
Smith        81          3 =
Brown        81          3 =
Jones        79          6
Turner       76          7


In this case it is better to credit Walker, Smith and Brown with
the average place, i.e. the fourth place. In the same way, suppose
two boys ‘tie’ in the mark which comes after the 10th place.
Instead of putting two 11th places between the 10th and the 13th
places it is wise to credit the two boys with equal marks with
‘11½’ places each. If correlation is to be performed this is
particularly important.
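
The rule of the average place can be stated as a small routine. The sketch below uses the marks from the list above; `average_ranks` is a hypothetical name, and the highest mark is taken to hold the first place.

```python
def average_ranks(marks):
    """Give each mark its place in order of merit, tied marks sharing the average place."""
    ordered = sorted(marks, reverse=True)
    first_place = {}  # first place occupied by each distinct mark
    count = {}        # number of pupils holding each distinct mark
    for place, mark in enumerate(ordered, start=1):
        first_place.setdefault(mark, place)
        count[mark] = count.get(mark, 0) + 1
    # A group of k tied pupils starting at place p occupies p .. p+k-1,
    # whose average is p + (k - 1) / 2.
    return [first_place[m] + (count[m] - 1) / 2 for m in marks]

marks = [92, 84, 81, 81, 81, 79, 76]  # Thompson ... Turner
print(average_ranks(marks))
```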

2. To separate candidates who reach a certain level from those who do not

Most of the public examinations, such as those for School
Certificate, matriculation, degrees and diplomas have this end in
view. At first sight this may seem easy, but it is beset with pitfalls.
It is unwise to draw our lines of demarcation on the frequency
curves at points where the curve is at its highest, for here there is
less chance of a critical separation of one class of candidates from
another. The standard of examination papers and of students
taking the examination varies from year to year. It is difficult
or impossible for an examiner who has set an examination paper
to know what standard it is by just looking at it. Only experiment
with many trials will show, and this is not usually possible.
Examiners are changed from year to year or after a short period
of years. Many examining bodies ‘standardize’ the marks, by
approximating the percentages of credits, passes, failures and even
distinctions respectively from year to year. It follows that in a
year when many good candidates present themselves it is much
more difficult to pass the examination than when there are more
weaker candidates.




3. Tests and examinations may be set by a teacher to test the value of his
own work or to estimate the progress already made by a class

This should help the teacher to find what is difficult and what
is easy to the pupils in his own teaching, and he can amend his
work accordingly.

4. Examinations should also look forward

and not only backward on the pupil’s past work. In other words,
examinations should be prognostic. How far they have this
quality has been the subject of considerable investigation. If the
boy or girl at eleven has reached a certain standard in Arithmetic
and English is he or she a fit candidate for a place in a grammar
school? Entry to the old universities may be secured with scholar-
ships if a candidate shows sufficient knowledge of and ability in
Mathematics. Is this a sufficient guarantee of a satisfactory
university and subsequent career?*

Examinations are not as reliable as they ought to be for some
or all of the following reasons:

(1) The number of questions of the older or essay type which
the candidate is able to answer in the allotted time is so small
that there is insufficient sampling of the candidate’s knowledge.
Questions of ‘luck’ or ‘chance’ figure too largely in the result,
from the candidate’s point of view.

(2) Candidates may differ in mental and physical condition
from day to day and this will affect performance in the examina-
tion. Vitamin intake, digestion, hours of sleep, mild infection,
other physical and emotional states, the time of day, atmospheric
and other environmental conditions and the total length of the
examination may modify the student's work in it, or in some
part of it.

(3) Particularly in the ‘Arts’ subjects there may arise differ-
ences of opinion between one examiner and another concerning
the value of a student’s work.

(4) Examiners are not always consistent with one another in
their standards of marking. Nor will the same examiner adhere
to the same standard at different times of the same day, at
different parts of the week and at different stages in marking a
large batch of examination papers.

* An excellent short examination of examinations is given in Chapter XI of
P. E. Vernon's The Measurement of Abilities.

The compounding of marks is a still more difficult task. Here
the idiosyncrasies of a number of markers in different subjects
will produce anomalies in the final result which are both unfair
and misleading. As so much is often made to depend on the sum
total of a candidate’s achievement in an ‘omnibus’ examination,
it is the duty of all concerned in the matter to investigate carefully
what really lies behind the masses of figures which are produced
from the several subjects of the examination.

In a public examination, such as the Intermediate Examina-
tions of the University of London, it may be possible to give equal
weight to each of the subjects which are taken; but in a school
annual examination this is not possible, nor is it for the marks
which are given on each term’s work. It is obvious that the
maximum marks in English should be greater than those for
Geography, just as those in Mathematics will usually be greater
than those in Chemistry. The reason is the obvious one that more
hours per week are devoted to English than to Geography, to
Mathematics than to Chemistry. (We will leave the problem of
relative importance from other points of view, though few would
contest the superior position of English in the school curriculum.)

A reasonable way of treating the marks of the respective
subjects before compounding them would be to arrange each
maximum mark so that it is proportional to the time devoted to
the particular subject each week.

Suppose 5 hours are spent on English, 4 hours on Mathematics,
3 hours on Science and 2 hours on History. We might allow a
term’s maximum of 200 marks for English, 160 for Mathematics,
120 for Science and 80 for History. It may happen that the total
for all subjects will come to some large number which is not a
multiple of a hundred. Whatever the total maximum, that is,
the total of the maxima of all the subjects, an order of merit can
be found just as easily, and if a percentage is required of the
maximum score this can subsequently be found by simple reduction.
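
The arithmetic of this scheme is sketched below. The hours and maxima are those of the example; the rate of 40 marks per weekly hour is implied by them, while the candidate's marks and the name `percentage_of_maximum` are invented for illustration.

```python
# Weekly hours from the example in the text.
hours = {"English": 5, "Mathematics": 4, "Science": 3, "History": 2}
MARKS_PER_HOUR = 40  # gives 200, 160, 120 and 80 as in the example

maxima = {subject: h * MARKS_PER_HOUR for subject, h in hours.items()}
total_maximum = sum(maxima.values())

def percentage_of_maximum(candidate_marks):
    """Reduce a candidate's compounded total to a percentage of the total maximum."""
    return 100 * sum(candidate_marks.values()) / total_maximum

# Hypothetical candidate.
candidate = {"English": 150, "Mathematics": 120, "Science": 90, "History": 60}
print(maxima, total_maximum, percentage_of_maximum(candidate))
```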
There is usually a more serious difficulty in the compounding
of marks. Some markers feel that a normal distribution of marks
tends to depress and discourage all but the top quartile division
of the candidates, whilst others feel that they may force their
students to strive for better ultimate examination results by
marking stiffly the work and tests of the term. Again, others find
marking so difficult that they are only able to separate from the
mass of papers the very poor candidates and the very good ones,
and all others are bunched together with very little spread or
dispersion of marking and a rather high average, usually of about
55%. This makes the compounding of marks difficult. We can
do something to adjust the various marking scales which will
improve matters somewhat. Each mark may be regarded as a
positive or negative deviation from the mean, which is called 0,
or the marks may be standardized by dividing these deviations
by the standard deviation. All this would involve much labour
which would certainly not be welcome and might not be possible
at the end of term. The marks might be improved for the pur-
poses of compounding by adjusting the marks in the interquartile
range by means of a graph.
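
The labour of standardizing is much reduced by a machine. The following is a minimal sketch, assuming Python's `statistics` module and two invented sets of marks for the same four pupils, one from a marker who bunches his marks and one from a marker who spreads them; `standardize` is a hypothetical name.

```python
from statistics import mean, pstdev

def standardize(marks):
    """Express each mark as a deviation from the mean in standard-deviation units."""
    m = mean(marks)
    s = pstdev(marks)  # standard deviation of the whole set of marks
    return [(x - m) / s for x in marks]

bunched = [52, 54, 56, 58]  # hypothetical marker with little dispersion
spread = [20, 40, 60, 80]   # hypothetical marker with wide dispersion

# After standardizing, the two sets are on a common scale and may be added.
compounded = [a + b for a, b in zip(standardize(bunched), standardize(spread))]
print([round(v, 2) for v in compounded])
```

Without this step the second marker's subject would dominate the order of merit merely because his marks are more widely spread.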

Another useful expedient is to adjust the marks by means of
a straight-line graph so that the top boy gets the maximum marks
and the bottom boy no marks. (The objection to this is that the
top boy may not be worthy of the maximum marks just as the
bottom boy will probably deserve something better than zero
marks.) All the objections in theory are met, however, by the
very practical result that the resulting order of merit is much
fairer to all concerned. We have said enough to show that no
system of marks is entirely above criticism, and if we keep in
mind the difficulties of marking and compounding our marks our
system will progressively improve.
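
The straight-line adjustment amounts to one line of arithmetic per mark. A minimal sketch, with invented marks and a hypothetical function name `stretch`; it assumes the top and bottom marks are not equal.

```python
def stretch(marks, maximum):
    """Rescale marks on a straight line so the lowest becomes 0 and the highest the maximum."""
    low, high = min(marks), max(marks)
    # Assumes high > low; otherwise there is no spread to stretch.
    return [maximum * (m - low) / (high - low) for m in marks]

marks = [48, 55, 62, 69]  # hypothetical bunched marks
print([round(v, 1) for v in stretch(marks, 100)])
```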

Most teachers soon evolve a personal system of marking, and
it is well for all who have to mark the work of pupils and students
to explore the fundamentals of their own ideas on the subject.
It is more difficult to mark papers of the essay type than those of
the new style where there are many shorter questions, which
usually only require a sentence in answer to each, or even the
choosing of a correct word or sentence from a number which are
given for each question. It is obviously more difficult to mark
an English essay where style is taken into consideration than an
arithmetic paper where a marking scheme can be followed fairly
closely. Where marks are deducted for errors, markers should see
that the total reduction bears a reasonable relationship to the
marks credited for correct work. Until much practice has been
obtained in marking and the marker has subjected his work to
careful examination, it will be inevitable that a careful re-marking
of a batch of papers, after a first assessment, will be desirable.
This will enable the earlier papers in a batch to be adjusted to
those which have come later and have been marked in ‘a state of
maturity’ for that particular examination. Some conscientious
examiners arrange the papers in order of merit as shown by their
marking and then re-read them in descending order of merit,
satisfying themselves that each paper is a little less worthy than
the one which preceded it. If the examination and the candidates
have been fairly matched the marks should be distributed in a
normal manner or in an approximation to it. In the case of fairly
homogeneous small groups (e.g. the mathematical ‘sets’ of a large
fifth form) it is difficult to obtain the requisite distribution of the
marking. It is obvious that the larger and more heterogeneous is
the group the easier will it be to obtain normal distribution. It
may be allowable in a scholarship examination when only a very
few of the finest candidates can obtain awards to permit a slight
positive skew to the distribution and thus give a better spread in
the upper reaches of the marking. In the same way it may be
permissible to allow a little negative skewing if the intention of the
examination is merely to reject a few candidates who fail to secure
a minimum mark of 40% or 50%, but the fact remains
that for general purposes normal distribution should be aimed at
and the marks which separate one class or degree of merit from
another should not coincide with the mode (which in the case of
normal distribution would also equal the mean and the median).

A simple problem in connection with marking is the reduction
of marks. The marks have been given to one maximum mark
and it is desired to reduce or translate them to another scale with
a different maximum. It is presumed that it is not desired to
interfere with, or endeavour to modify in any way, the relative
distribution of the marks which would be best achieved by
drawing a curved-line graph.

The simple task of ‘reducing marks’ is best effected by one of
three ways:

(1) Using a slide-rule.

(2) Drawing a straight-line graph.

(3) Multiplication of the marks by an easy fraction.

1. Using a slide-rule.* This simple instrument permits multi-
plication and division sums to be performed by adding or sub-
tracting lengths of a ruler. As the standard engineer's slide-rule
permits the use of various functions and is a more complicated
instrument than we require for the simple reduction of marks,
some schools possess a large slide-rule which is graduated for
multiplication and division only. Suppose we have marked to a
maximum of 120 marks and we wish to reduce these marks to
a maximum of 100, that is, to express them as a percentage of the
maximum. We take the slide-rule and move the lower scale (B)
so that the graduation 12 on it corresponds with 10 on the upper
scale (A). The given mark is found on scale B and the reduced
mark is read opposite to this on scale A.

2. A ‘ready-reckoner’ table can be made in convenient form by
drawing a straight-line graph. It is best to use graph paper where
each large division contains ten (and not five) small divisions for
this will facilitate reading the graph. To take the case given
above. A point on the graph paper, on which axes have been
drawn horizontally at the bottom of the paper and vertically on
the left side, is found which corresponds to the maxima in the
given and on the reduced scale. This will be the point with x value
120 and y value 100. The point 12, 10 (counting in large squares)
is found and joined to the point O (the point of intersection of the
axes) and the resulting straight line is the graph required. It is
only necessary to find the corresponding value on it when an x
value (that is, a mark on the 120 maximum scale) is read off
horizontally.


* See Appendix II.
3. Many simple reductions can be performed by rapid mental
arithmetic. Reductions to a half or a tenth or by two-thirds, a
fifth and so on would give no trouble. A reduction which fre-
quently occurs is from 25 to 10 as maxima. This is equivalent to
dividing by 2½, which is equal to multiplying by 4/10. Thus we
multiply each mark on the 25 scale by 4 and divide it by 10 by
shifting the decimal point one place to the left. The reduction
from a maximum of 120 to one of 100 is equivalent to multiplying
by the fraction 5/6 or 10/12.

Most people could achieve this very quickly by adding a nought
to each mark on the 120 scale to multiply it by 10 and then
dividing each number by 12. Some conscientious teachers who
find difficulty in handling figures obtain their reductions by one
method and check them with another.
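
All three methods perform the same multiplication, which can be checked as follows. The function name `reduce_mark` and the sample marks are hypothetical; the maxima are those discussed above.

```python
def reduce_mark(mark, old_maximum, new_maximum):
    """Translate a mark from one maximum to another without altering relative standing."""
    return mark * new_maximum / old_maximum

# From a maximum of 120 to a percentage: multiply by 10, divide by 12.
print(reduce_mark(90, 120, 100))

# From a maximum of 25 to one of 10: multiply by 4, divide by 10.
print(reduce_mark(15, 25, 10))
```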

The importance of the transfer examination which is now taken
by all children in state-controlled schools at the end of their
primary school life has become greater, not less, since the passing
of the Education Act of 1944. In view of the fact that the whole
subsequent life and career of a child may be modified by the type
of secondary education which he receives, it is hardly necessary to
say that anything which can be done to improve the transfer
examination, which is taken at about the age of eleven, should be
regarded as a matter of prime importance. We should look upon
the test as one which should have a prognostic value. Although
statistical analysis in these matters is probably of less importance
than the sound framing of the test papers, it is only by mathematical
investigation that we can be assured that we are on the right
lines in our examination methods. Much yet remains to be done,
but all honour should be given to Professor Godfrey Thomson,
who has devoted many years of his life to these problems and
with his staff has evolved the Moray House Tests. It is obvious
that the standard of the tests should be maintained from year to
year and that the tests should aim at determining the type of
secondary education which will best fit a particular child rather
than testing the attainment and factual content of the child.
Accordingly, tests in English, Arithmetic, and a General Paper
which seeks to explore the native capacity (often called intelligence)
of the child, are prepared with this end in view and are
standardized by exhaustive experimental tests. It is easy to
see that ideal tests for children of 10-11 cannot be evolved
by 'an armchair process'; only painstaking trial and error
and careful analysis of the results will suffice. Even so, no ideal
tests have yet been found, and there is still at least 10% error in
the prognostic value of most transfer tests. Nor is the underlying
psychological theory a matter on which there is complete agreement
between eminent authorities. It is believed that the average
verbal ability of girls at the transfer age is somewhat greater than
that of boys. This difference would appear to be accentuated in
the case of children from the country, that is, from small villages
rather than from towns. Dr W. P. Alexander has stressed
repeatedly and with justification the necessity of allowing for
non-verbal abilities in transfer examinations, and he would
divide abilities by means of oblique factors* into verbal and
non-verbal types. Enough has been said to show that the serious
student interested in the transfer examination will find much data
which can be explored by statistical methods and will yield useful
results. These must still be regarded as being valuable even when
they only serve to show us the weaknesses of our methods and do
not always offer any ideas for their improvement.

* See page 109.

In connection with transfer examinations and attainment tests
an important matter susceptible to statistical treatment is the age
allowance in marking schemes.

Some education authorities permit only a single attempt at a
transfer examination, and there is thus an age range of a year.
Allowance is made for differences of not less than a month. Other
authorities have an age range of two years or even more and
permit two attempts at the examination if necessary. In fixing
this allowance it is wise to make experiments with large numbers
of children of various age groups and to use general papers
containing tests of 'intelligence', English and Arithmetic rather
than papers of more limited scope. We could set a series of papers
to children in age groups of 12, 11 and 10 respectively, and find
the median score for each paper (or set of papers) and for each
group. The median score or norm on the same paper would show
an increase from one year group to the next. By drawing a graph for
each paper (or set of papers), using the three points of the 12, 11
and 10 norms, we find that we can find a straight line which
practically goes through the three points in each case. If we use
the graph to call the 12-year norm 100, we can read off the
11-year and 10-year norms on this scale. The graphs obtained
from the median scores of the other sets of papers will have
different slopes, but when the 12-year median score is called 100
and the other norms multiplied by the same fraction or read off
on the graph we shall probably find that the other norms differ
a little for the same age group. The average is then taken.

Suppose that the difference averages about 24% per year. At
first sight it may appear that 2% should be added to the marks
of a candidate for every month of his age below 12 years. This
would probably be unfair as 2% of a lower mark is obviously less
than that of a higher. To overcome this several methods are
employed. We can take the age of the pupil with the greatest
number of marks and, reckoning two marks per month as an age
allowance, scale up his marks to those which would be expected
if he were 12 by means of a graph or a slide-rule. The
corrections work out as follows:


    Age      Per cent

    12.0     100
    11.11     98
    11.10     96
    11.9      94
    11.8      92
    11.7      90
    11.6      88
    11.5      86
    11.4      84
    11.3      82
    11.2      80
    11.1      78
    11.0      76
    etc.     etc.

Thus we should find the percentage corresponding to the age of
the pupil and multiply his marks by a fraction with this percentage
in the denominator and 100 in the numerator. For example, suppose a boy
of 11 years 4 months obtains a total of 362 marks. His expected
achievement at the age of 12 years precisely would be

$$362 \times \frac{100}{84} \approx 431 \text{ marks.}$$
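The table and the worked example can be reproduced with a minimal sketch. The function names are illustrative assumptions; the rate of 2% per month is the figure supposed in the text, not a general rule.

```python
def age_percentage(years, months, per_month=2):
    """Percentage of the 12-year standard expected at a given age below 12."""
    months_below_12 = 12 * 12 - (12 * years + months)
    return 100 - per_month * months_below_12

def marks_at_twelve(marks, years, months):
    """Scale marks up to the achievement expected at exactly 12 years."""
    return marks * 100 / age_percentage(years, months)

print(age_percentage(11, 4))               # percentage for 11 years 4 months
print(round(marks_at_twelve(362, 11, 4)))  # the boy with 362 marks
```

Because the allowance is a multiplier and not a fixed number of marks, a higher raw score receives a proportionately larger addition.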

The matter may be regarded from another angle: we have
obtained norms for each age group and by interpolation we can
obtain norms for each month. Every pupil's marks will correspond
with a particular age norm and therefore we could give
an assessment of the achievement of each pupil in terms of his
test or examination age, that is, the number of months above or
below average as an equivalent of a greater or lesser ability than
the normal for his age.



Fig. 17. Percentile curves for four three-month groups. XY represents an age
allowance for 9 months at a particular percentile level. This level must then be
interpreted in terms of the scores of the whole of the candidates from a separate
curve. For convenience percentiles have been reckoned from the highest score.

In transfer examinations some authorities, following a method
similar to that which we have outlined above, have a table of
percentages of marks which are added to the total scores of the
children according to their ages. A cruder method which is
employed by others is to have a table of marks and add an
appropriate number to those of a child in regard to his age but
without regard to his achievement. Strictly speaking, the percentage
or proportional way of making the increase is the only
equitable way, for the method of adding fixed numbers of marks
according to age benefits the weaker children at the expense of
the more able.

The best method for ordinary use and one which does not
involve a great deal of labour is that due to Thomson.* The total
marks (or those in separate subjects) for every child are divided
into four age groups: 11.0 years to 11.2 years inclusive, 11.3 to
11.5 years, 11.6 to 11.8 years, 11.9 to 11.11 years. Cumulative
frequency (percentile) curves are drawn for the marks in each
group. The abscissae differences between the first and the fourth
curves give the differences in marks corresponding to a 9 months'
age difference. It will be noted that this difference is one of
9 months and not of 12 months as each curve is for the average
age of the three-month age group, that is, the first curve is centred
on an age of 11 years 1½ months and the last on 11 years
10½ months. It is now necessary to interpret these in terms of the
percentiles and marks of the whole 11-year group taken together.
Usually no child under 11 is given more than the allowance
for 11.0 years. The mark difference for 9 months is divided by 9
to give the monthly adjustment for each score level. Equivalent
marks are subtracted for children from 12.0 years to 12.11 years.

* See The British Journal of Educational Psychology, 1936.

There remains the question of the ideal mark scale and the
mark value of each question in a given test. These matters can
best be understood by further reference to our curve of normal
distribution. It will be seen that if we draw vertical lines at
distances of 3σ on each side of the central point the area enclosed
by these lines and the curve is practically the whole of its area.
Now, the area of the curve gives the frequency or the number of
cases or scores, and only 0.27% of the scores lie beyond the 3σ lines
at the left and right extremes of the curve. (This will be clear
from our short chapter on the normal curve.) If instead of
drawing our vertical lines at points 3σ from the centre we choose
points at a distance 2.5σ on each side of this point, the area of the
curve thus enclosed is 98.76% of the whole, that is to say, we have
omitted only 1.24% of the whole scores. Although we have made
slight sacrifices to accuracy it is very convenient to have a base of
5σ instead of 6σ because we can more readily divide it into a ten-
or a hundred-part scale, and for our purpose here this arrangement
is quite accurate enough.
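The areas quoted for the 3σ and 2.5σ limits can be verified without tables. The sketch below uses the error function from Python's standard library; the helper name `area_within` is an illustrative assumption.

```python
import math

def area_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (3.0, 2.5):
    inside = area_within(k)
    print(f"within {k} sigma: {100 * inside:.2f}%   beyond: {100 * (1 - inside):.2f}%")
```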





Suppose now that we divide it into 10 equal divisions along its
base, and further let us imagine that in a test we have this number
of properly graded questions, so that on drawing a graph showing
the number of persons solving each question we get a distribution
curve of the normal type.

The scale of ability is taken to be similar to that of the scale of
difficulty of the questions. Now area 'a' is equivalent to the
number of those who cannot solve Question 1. Similarly area
'ab' represents the number of those who cannot solve Question 2,
'abc' those who cannot solve Question 3, and so on. Obviously
the mark value of a question should increase with the proportion
of people who fail to solve it. For instance, by consulting the
tables giving the proportions of curves of normal distribution
which are cut off by ordinates at particular distances from the
central point,* we can find that the area abcdefg is approximately
85% of the area of the whole curve. Hence Question 7 would be
too hard for 85% of the candidates but it could be solved by the
remaining 15% (assuming that the time factor did not enter).

Thus if a question is solved by 15% of the candidates it will be
of difficulty 7 and take this number of marks.

We can take the matter a step forward by drawing a percentile
curve showing the percentages of candidates failing to solve each
problem according to its difficulty and the marks which will be
given to it.

The student will find the construction of such a curve and the
following tables an easy exercise in the use of the normal distribution
or probability-integral tables:


    Marks per    % able to    % failing to
    question     solve it     solve it

     1           98.35          1.65
     2           94             6
     3           85            15
     4           70            30
     5           50            50
     6           30            70
     7           15            85
     8            6            94
     9            2            98
    10           almost 0      almost 100
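As a sketch of the exercise, the percentages failing each question can be computed directly. It is assumed here (the text does not say so explicitly) that each area is measured from the −2.5σ limit up to the upper edge of the corresponding half-σ division; this reproduces the 1.65% for Question 1 and gives values close to, though not identical with, the rounded figures in the table.

```python
import math

def phi(z):
    """Cumulative normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def percent_failing(question, divisions=10, half_range=2.5):
    """Area from -half_range sigma up to the upper edge of the given division."""
    width = 2 * half_range / divisions
    upper = -half_range + question * width
    return 100 * (phi(upper) - phi(-half_range))

for q in range(1, 11):
    failing = percent_failing(q)
    print(f"Question {q:2d}: {failing:6.2f}% failing")
```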


* See page 70.

In order not to break too much with time-honoured custom
and yet maintain a system which permits a mathematically
reliable compounding of marks, some authorities regard 90% as
the highest mark and 30% as the lowest in all but exceptional
cases. Only one candidate in several hundred or even a thousand
is regarded as being so excellent that he achieves more than 90%
or so feeble that he scores less than 30%. This method, being used
by schoolmasters and in certain of the public university examinations,
obviously implies a certain degree of homogeneity resulting
from the selection of the more able individuals from the population
at large.

A reasonable dispersion would be given by a standard deviation 
of 10 and, assuming a normal distribution, a median of 60. In this 
case the percentages of candidates expected to achieve scores in 
various mark groups would be as follows: (The extreme upper and 
lower reaches of the marking are reserved for candidates of rare 
brilliance or poverty of achievement.) 


    Mark %     % in each group

    92-88      up to ½
    87-83       1
    82-78       3
    77-73       6½
    72-68      12
    67-63      17
    62-58      20
    57-53      17
    52-48      12
    47-43       6½
    42-38       3
    37-33       1
    32-28      up to ½
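The expected percentages can be recomputed from the normal curve as a check. This sketch assumes whole-number marks, so each five-mark group is taken to extend half a mark beyond its printed limits; that continuity convention is an assumption, not something stated in the text.

```python
import math

def phi(z):
    """Cumulative normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def percent_in_band(low, high, mean=60.0, sd=10.0):
    """Expected percentage of scores between low - 0.5 and high + 0.5."""
    a = (low - 0.5 - mean) / sd
    b = (high + 0.5 - mean) / sd
    return 100 * (phi(b) - phi(a))

bands = [(low, low + 4) for low in range(88, 27, -5)]
for low, high in bands:
    print(f"{high}-{low}: {percent_in_band(low, high):.1f}%")
```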


In practice, things do not work out quite as easily as this. Marks
have to be allowed in many cases for answers which are partly
correct and in many tests a choice of questions has to be permitted.
In the 'new-type' examinations the number of questions
would be much larger than in the old type and answers would be
right or wrong, for the most part. Also, in view of the larger number
of questions, proper sampling of the candidates can be achieved
and there is no need to permit selection on the part of candidates.
Nevertheless, in any type of examination a proper order of merit
will only be secured by a proper grading of questions in difficulty,
with a weighting of marks in accordance with the requirements of
the curve of normal distribution. It is not pretended that practical
achievement in examining can match up to theoretical ideal
demands but a more careful mathematical analysis of each test
will go far to improve a system of examinations which has not yet
been replaced as a means of assessing ability and achievement.

In a work well known to the point of notoriety Hartog and
Rhodes produced evidence to show the unreliability of examinations.
No doubt 'An Examination of Examinations' was intended to
make our flesh creep, and to sustain their thesis the authors chose
cases which did all they could to show the subjectivism of marking
in the worst possible light. Most of the sets of scripts which were
used for their experiments were more homogeneous than we should
ordinarily find. Such sets of papers always present difficulties and
it is well known that to secure a distribution which approaches a
normal one we must use a large and heterogeneous group. Nevertheless,
the work of these authors did much to bring a realization
of the need for more care in examinations no matter at what level.

On the other hand, the value of examinations and the care and
thought with which they are conducted have been finely expressed
by Brereton in The Case for Examinations. It is a step forward
if only average marks and standard deviations or interquartile
ranges are equalized between one examiner and another or
between one subject and another before marks are compounded.
There is an increasing awareness of the necessity of this, and that a
failure to do so will lead to erroneous and anomalous results in
final order of merit lists.

It must not be assumed that the new type of test is in all ways
superior to the old, or that it is free from defect. Vernon in The
Measurement of Abilities has given an excellent analysis of this
matter. Much more time, skill and experience are necessary for
the production of the new-type test paper containing many
graded questions, but time is saved in marking the scripts. Unless
the number of scripts exceeds 300 no time is saved on the aggregate
of setting the papers and marking the scripts. The examiner
must decide just which type of question suits his purpose for the
subject matter in hand. The questions may be divided into the
following types: (a) Simple recall and 'open-completion', where
blank spaces in the question have to be filled in. (b) True-false,
where there is a set of statements some of which are true and some
false. The candidate has to indicate 'which is which'. (c) The
Multiple-choice type, including best reason and matching items.
In each case a number of alternative answers are given. One is
correct and this is to be underlined by the candidate. (d) Rearrangement
type. Here a list of items which should fall into a
unique order is given in the wrong order. The candidate must
rearrange them to give the correct order.

In the new-type tests a certain number of correct answers in the
recognition-type of test may be obtained by chance guessing. This
only means that the zero level in scoring is equivalent to a score
which could be calculated as being the percentage of marks which
might have been obtained by pure chance. The marks obtained
may be corrected for guessing by using the formula

$$\text{True score} = R - \frac{W}{n - 1}$$

where R is the total number right, W the total number
wrong and n the number of alternative answers provided for each
question. It has been shown that the above correction only makes
appropriate compensations for the average candidate. On the
whole the effect of guessing is much less than the layman would
imagine.*
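The correction for guessing is easily put into a small function. The name `corrected_score` and the sample figures are illustrative assumptions; the second example shows that a candidate who guesses blindly at 100 four-choice questions, and gets the 25 right that chance would give, is brought back to zero.

```python
def corrected_score(right, wrong, alternatives):
    """True score = R - W / (n - 1), the correction for chance guessing."""
    if alternatives < 2:
        raise ValueError("each question needs at least two alternative answers")
    return right - wrong / (alternatives - 1)

print(corrected_score(40, 12, 4))   # a hypothetical candidate
print(corrected_score(25, 75, 4))   # pure guessing on 100 questions
```

Questions left unanswered count neither as right nor as wrong in this formula.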


* The system of marking at most musical festivals and competitions seems to be
extraordinary. Even very poor efforts are not infrequently given upwards of 75%
and the majority of candidates obtain more than 85%. This is obviously intended
to hearten all candidates and to maintain enthusiasm for subsequent occasions.
Nevertheless, the adjudicator's task is rendered difficult by this system and his final
marks are perforce given by reference to an order of merit resulting from a quick
consideration of the qualities which make one competitor or group slightly better
than another. The adjudicator needs good experience, judgment and memory.


Mental Ages and Intelligence Quotients

The Mental Age (M.A.) of a child as given by an intelligence
test, or its Educational Age (E.A.) as given by educational tests, is
equal to the actual or Chronological Age (C.A.) of an average
child with the same test scores. Intelligence Quotient is given by

$$\text{I.Q.} = \frac{\text{Mental Age}}{\text{Chronological Age}}$$

and is often expressed as a percentage. At first
sight these may seem to be a much simpler and more straightforward
method of describing attainments or abilities than the use
of percentile levels. There are some difficulties, however. To
start with, the growth of intelligence and of educational abilities is
not regular year by year. The upper limits of achievement vary
from child to child. After the age of eleven the intelligence-test
scale becomes so unreliable and artificial that it is wise to abandon
M.A. units from the age of 12 upwards. The proportional
advancement or backwardness of a child whether in educational
achievement or intelligence tends to increase with increasing age.
The fractions M.A./C.A. (i.e. I.Q.) and E.A./C.A. (i.e. E.Q.) keep
reasonably constant for a number of years.
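A minimal sketch of the quotient, expressed as a percentage, follows. Ages are taken in months to avoid fractions of a year; the function name and the example ages are assumptions made for illustration only.

```python
def quotient(test_age_months, chronological_age_months):
    """I.Q. (or E.Q.) as a percentage: 100 x test age / chronological age."""
    return 100 * test_age_months / chronological_age_months

# Hypothetical child: mental age 12 years 0 months, actual age 10 years 0 months
print(quotient(12 * 12, 10 * 12))
# Hypothetical child: educational age 9 years 6 months, actual age 10 years 0 months
print(quotient(9 * 12 + 6, 10 * 12))
```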

There is nothing absolute about a scale of intelligence 'norms',
or the marking scale of an intelligence test. Unless all intelligence
tests (in addition to all the other desiderata) are standardized as
regards mean or average and standard deviation, statements of
I.Q. measurements will be ambiguous. We can only say 'the I.Q.
of Smith as measured by this or that particular test is x'. The
Moray House Tests yield an average score of 100 and an S.D. of 15.
The Stanford Binet tests were formerly believed to yield an S.D.
of 15 but this is now known to be 16½. In fact the S.D.s of
intelligence-test scores vary from 12 to 25 (with a mean score of 100).
The matter can only be made accurate by expressing differences
in achievement in standard deviation units (see page 26).*
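The ambiguity can be illustrated by converting one and the same relative standing between tests of different spread. The sketch below assumes both tests have a mean of 100 and that scores are normally distributed; the function names and the example score of 130 are illustrative assumptions.

```python
def standard_units(score, mean, sd):
    """Deviation of a score from the mean, in standard deviation units."""
    return (score - mean) / sd

def convert(score, sd_from, sd_to, mean=100.0):
    """Score on a test with S.D. sd_to equivalent to `score` on a test with S.D. sd_from."""
    return mean + standard_units(score, mean, sd_from) * sd_to

z = standard_units(130, 100, 15)
print(z)                      # standing in S.D. units on a test with S.D. 15
print(convert(130, 15, 12))   # same standing on a test with S.D. 12
print(convert(130, 15, 25))   # same standing on a test with S.D. 25
```

The same child would thus be credited with quite different 'I.Q.s' by tests at the two extremes of the range of S.D.s mentioned above.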

* This section should be followed up with Chapter X of Vernon's The Measurement
of Abilities.

We have left until last a short statement of the chief difficulty,
and one which is perhaps not apparent at first. It is that of
establishing age norms. It is practically impossible to take a sufficiently
large sample which will represent all possible children of any age
group. In primary school life it is perhaps possible if we cast a
wide net to find groups which give us a fair sample of the total
population, but even here it is difficult to allow for the children
(either bright or dull) who attend private schools or those who
go to special schools. After the age of 11, with the children in
various types of secondary schools, the problem becomes even more
difficult. There is still room in the field of simple research by
teachers for experiments using intelligence tests with children of
various ages, physical types, 'social' positions and localities. Although
many hundreds of thousands of such tests have been given there
is still no shortage of opportunities for their use. In rare cases it
has been possible to test all the children of a certain age or from
a certain locality but more often the best that can be done is to
select them from as many schools as possible in different districts
to give as wide a range of social and economic differences as
possible.

To Standardize an Intelligence Test

If we could give the intelligence test to very large numbers of
children in year groups of 10, 11 and 12 (making sure that each
group is truly representative of all children of that age), we could
plot the three averages as equally-spaced ordinates on a graph
and join the points. This would yield a straight line sloping upwards
and by interpolation we could read off the monthly norms.



Fig. 19. The line of best fit is found by the method of least squares.

It would be convenient to have each of the ordinates separated
by 12 units of abscissae in order to facilitate these monthly
interpolations. This method would be open to many objections. The
division into years is far too coarse and little attention is paid to
finer differences in the 11+ year which may be the most important
from our point of view, particularly if we are interested in the
transfer examinations at the end of primary school life. Moreover,
errors of sampling and distribution cannot be corrected by this
method of taking the three year groups.

A much better method is that due to Thomson.* A 'complete,
numerous and uncreamed' year is tested. The year group is
divided up into 12 monthly groups, which must be as large and
heterogeneous as possible so that each shall be a good sample of
that age group of the whole population. The average score in the
test for each monthly age group is found and plotted as an ordinate
on a graph with abscissae giving the monthly spacings. Owing to
errors in sampling the twelve (or thirteen) plotted points will
usually not lie on a straight line. The line of best fit has to be
found. As usual this is done by the method of least squares, that
is, the sum of the squares of the deviations of the ordinate points
from the line must be made a minimum.† The straight line of best
fit can be extended backwards to deal with the 10+ age group
and forwards for the 12+ group. A child's M.A. can therefore
be read off on this line by reference to his score in the test. His
I.Q. can be found by dividing by his chronological age.

Intelligence tests may also be standardized by comparing scores
achieved in them with those in established tests such as the Binet,
using the same groups of children.


* See The British Journal of Educational Psychology, 1932, page 99.

† The quantity $\sum u^2$, where the u's are the deviations from zero obtained when the
twelve or thirteen points obtained from the scores are substituted in the equation
of the straight line $y = mx + c$. The values of m and c which give this are found
from the equations:

$$\sum y - m \sum x - nc = 0$$
$$\sum xy - m \sum x^2 - c \sum x = 0$$

where x represents ages, y the scores and n the number of points.
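The two equations of the footnote can be solved directly for m and c, as in the following sketch. The monthly averages below are invented purely for illustration and are not Thomson's data; ages are in months, and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Solve the two least-squares equations for the slope m and intercept c."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return m, c

# Hypothetical average scores for the twelve monthly groups, ages 11.0 to 11.11
ages = list(range(132, 144))
scores = [50.2, 50.9, 52.1, 52.8, 54.3, 54.9, 56.2, 56.8, 58.1, 59.0, 60.2, 60.7]
m, c = fit_line(ages, scores)

def mental_age(score):
    """Age in months at which the fitted line gives this score as the norm."""
    return (score - c) / m

print(round(m, 3), round(c, 2))
print(round(mental_age(58.0), 1))   # may be extended beyond 132-143 months
```

Dividing the mental age so read off by the child's chronological age in months, and multiplying by 100, gives the I.Q. as a percentage.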



CHAPTER VII 


THE ‘FACTORS’ OF THE MIND 

By measuring we know what things are long and what
short. The relations of all things may be thus determined
and it is of the greatest importance to measure the motions of
the mind.

MENCIUS, c. 335 B.C. 

In the early years of this century Professor Charles Spearman
commenced a serious investigation into the nature of human
abilities. 'One of the most pernicious (of fallacies) was found to
be the current usage of the word "intelligence" without any
definite idea behind it. Another, that does even greater mischief
in practice, was the irrepressible tendency to assume that terms like
"attention", "combination", "analysis", "range of association",
"co-ordination of hand and eye" and so forth represent so many
functional unities or behaviour units. Alongside of these two great
impediments to the advance of science has been the pseudo-explanation
of the tests of a person's "intelligence" as measuring a
"level", "average" or "sample" of his abilities whereas really no
measurement is conceivably possible.'* The works on educational
psychology have persisted in telling us that the 'faculty' psychology
is dead (which should be true) but there has been a tendency to
resurrect it in terms of mental factors.

* C. Spearman, The Abilities of Man, pages 4^-ie.

Spearman investigated five 'laws' quantitatively: the laws of
Span, Retentivity (inertia and dispositions), Fatigue, Conation
and Primordial Potencies (including such influences as those of
age, sex, heredity and health). It was in these investigations, in
which he attempted to put certain aspects of psychology on a
scientific basis, that he made great use of correlation coefficients
between tests, and examined them by mathematical analysis. At
first it was necessary to achieve a 'Copernican revolution' in point
of view. Instead of postulating 'an ill-defined mental entity, the
intelligence', and then by 'intelligence tests' trying to obtain
a value for this, he started with a perfectly defined quantitative
value 'g' and then demonstrated what mental entity or entities
this really characterizes.

Spearman showed that the coefficients of correlation between
tests tend to fall into 'hierarchical' order and he further demonstrated
that this was consistent with his 'Two Factor' theory.


An example will suffice to show how this works out:

Suppose the correlation coefficients between a number of tests
1, 2, 3, 4, 5, 6 are written down in rows and columns as follows:*



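$$
\begin{array}{c|cccccc}
  & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
1 &        & r_{12} & r_{13} & r_{14} & r_{15} & r_{16} \\
2 & r_{12} &        & r_{23} & r_{24} & r_{25} & r_{26} \\
3 & r_{13} & r_{23} &        & r_{34} & r_{35} & r_{36} \\
4 & r_{14} & r_{24} & r_{34} &        & r_{45} & r_{46} \\
5 & r_{15} & r_{25} & r_{35} & r_{45} &        & r_{56} \\
6 & r_{16} & r_{26} & r_{36} & r_{46} & r_{56} &
\end{array}
$$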


The tests which give each correlation coefficient are denoted by the
subscripts of r, e.g. $r_{34}$ is the correlation coefficient between tests
3 and 4. The above arrangement of rows and columns is known
as a Matrix and in research work on psychological tests the
elementary properties of such sets of numbers are of prime
importance.


¹ The coefficient of correlation between two sets of measures is the proportion of
the total variance which is due to the common factor in each test:

$$r = \frac{\sigma_c^2}{\sigma_t^2}$$

where $\sigma_c^2$ is the variance due to the common factor and $\sigma_t^2$ the total variance.

Note that variance is the square of the standard deviation and that variances may
be added algebraically.

* The exact nature of the tests in this case is of secondary importance. Examples
would be: Analogies; Opposites; Resemblances; Understanding instructions;
'Completion'.

Let us consider the matrix rewritten with numerical correlation
coefficients:

    Test      1      2      3      4      5      6

    1         -     .48    .24    .54    .42    .30
    2        .48     -     .32    .72    .56    .40
    3        .24    .32     -     .36    .28    .20
    4        .54    .72    .36     -     .63    .45
    5        .42    .56    .28    .63     -     .35
    6        .30    .40    .20    .45    .35     -

    Total   1.98   2.48   1.40   2.70   2.24   1.70


We have added up the coefficients in columns and now proceed
to rearrange the matrix so that the totals of the columns are in
descending order of magnitude, thus:


    Test      4      2      5      1      6      3

    4         -     .72    .63    .54    .45    .36
    2        .72     -     .56    .48    .40    .32
    5        .63    .56     -     .42    .35    .28
    1        .54    .48    .42     -     .30    .24
    6        .45    .40    .35    .30     -     .20
    3        .36    .32    .28    .24    .20     -

    Total   2.70   2.48   2.24   1.98   1.70   1.40
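The rearrangement is purely mechanical and may be sketched as follows. The coefficients are those of the table above; the variable names are illustrative, and `None` stands for the blank diagonal.

```python
tests = [1, 2, 3, 4, 5, 6]
r = [
    [None, .48, .24, .54, .42, .30],
    [.48, None, .32, .72, .56, .40],
    [.24, .32, None, .36, .28, .20],
    [.54, .72, .36, None, .63, .45],
    [.42, .56, .28, .63, None, .35],
    [.30, .40, .20, .45, .35, None],
]

# Column totals, ignoring the blank diagonal
totals = [sum(row[j] for row in r if row[j] is not None) for j in range(6)]

# Reorder rows and columns together so that the totals descend
order = sorted(range(6), key=lambda j: totals[j], reverse=True)
rearranged = [[r[i][j] for j in order] for i in order]

print([tests[j] for j in order])
print([round(totals[j], 2) for j in order])
```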


In this ideal case* the 'hierarchical order', as Professor Spearman
called it, is easily seen. The correlation coefficients in any two
columns have a constant ratio to one another. Consider the last
two columns:

    .45    .36
    .40    .32
    .35    .28
    .30    .24
     -     .20
    .20     -


* Given by G. H. Thomson, The Factorial Analysis of Human Ability. (The
hypothetical coefficients have been chosen to demonstrate the principle in the
easiest way.)

Ignoring those coefficients which are not paired it is easily seen
that there is a ratio of 5 : 4 between the left and right columns.
In other words each coefficient on the right is 4/5 of that on the left.

This precise relationship would not be apparent in actual tests
but the tendency would still be evident. Spearman explained this
hierarchical order by a common factor 'g' which was present in
each test but in the largest quantity in that at the head of the
hierarchy. Each test also contains a specific factor which would
not be found in any other test unless similar varieties of the same
test had been used. A test is said to be 'saturated' or 'loaded' with
g to an extent depending on its place in the hierarchy. Suppose
it were possible to devise a test of pure 'g', that is to say, one
completely saturated with 'g' and containing no specific or 's' factor.
Such a test would stand at the head of the hierarchy. The self-correlations
of the tests are ideally unity and the diagonals of the
matrices have been left blank. In the case of the self-correlation
of pure 'g' it can be written in and this number (unity) will conform
to the hierarchy. In the other unities the 'specifics' enter and
they are omitted as they do not conform to the rule of proportionality
between the columns.


We may now rewrite the matrix including 'pure' g:

           g       a       b       c       d       e       f

    g      1      r_ag    r_bg    r_cg    r_dg    r_eg    r_fg
    a     r_ag     -      .72     .63     .54     .45     .36
    b     r_bg    .72      -      .56     .48     .40     .32
    c     r_cg    .63     .56      -      .42     .35     .28
    d     r_dg    .54     .48     .42      -      .30     .24
    e     r_eg    .45     .40     .35     .30      -      .20
    f     r_fg    .36     .32     .28     .24     .20      -

The first row and column give the correlations or saturations of
the tests a, b, c, d, e, f with g. Let us examine the first two
columns:

     1      r_ag
    r_ag     -
    r_bg    .72
    r_cg    .63
    r_dg    .54
    r_eg    .45
    r_fg    .36


Tetrad Differences 

We have already noted that in the hierarchical order the
correlation coefficients in the columns of the matrix tend to be in
the same ratio. Let us take out any group of four coefficients from
the matrix:

    Test     d      e

    a       .54    .45
    b       .48    .40

Then .54 × .40 = .45 × .48
or   .54 × .40 − .45 × .48 = 0.

This is called a tetrad difference and this one is

$$r_{ad} \times r_{be} - r_{bd} \times r_{ae} = 0.^{*}$$

Thus, another way of putting Spearman's discovery is that the
tetrad differences tend to be zero.

Spearman gives his tetrad equation in the form:

$$r_{ap} \times r_{bq} - r_{aq} \times r_{bp} = 0$$

When this equation holds throughout any table of correlations,
and only when it does, every individual measurement of every
ability or any other variable contained in the table can be divided
into two parts: 'The one part has been called the general factor
and denoted by the letter g; it is so named because, although
varying freely from individual to individual, it remains the same
for any one individual in respect of all the correlated abilities.
The second part has been called the specific factor and denoted
by the letter s. It not only varies from individual to individual,
but even for any one individual from one ability to another.'†
(Spearman's two-factor theorem is a piece of general mathematical
analysis and is in no way confined to psychology.)

* Those who have some knowledge of determinants will see in this a minor
determinant solved by cross-multiplying.

The precise mathematical expression of the divisibility into two
parts is given in the following equation:

    $m_{ax} = r_{ag} g_x + r_{as_a} s_{ax}$

In this case the sum of the squares of all the scores comes to
the number of persons (N). If we take the average by dividing by
N we are left with the relationship

    (saturation with g)² + (saturation with s)² = 1

    $r_{ag}^2 + r_{as_a}^2 = 1$ (the ‘variance of the test’)

    communality + specificity = variance



The area of each oval E and F, and each rectangle ABCD and A'B'C'D', repre-
sents the variance of an ability or test. The shaded overlap represents the co-
variance, which will equal the correlation coefficient if the areas of each of the
rectangles and ovals can be taken as unity. Where this is not the case the correlation
is given by dividing the area of the overlap by the root of the product of the ovals.

We can now express the tests in the form of equations containing
g and s.

e.g. Taking a saturation with g of .9:

    $.9^2 + s^2 = 1$

    $s = \sqrt{1 - .81} = \sqrt{.19} \approx .436$

¹ The Abilities of Man, page 75.

Hence if z is the score of a person in the test given by the suffix
to z

    $z_a = .9g + .436 s_a$

    $z_b = .8g + .600 s_b$

    $z_c = .7g + .714 s_c$

    $z_d = .6g + .800 s_d$

    $z_e = .5g + .866 s_e$

    $z_f = .4g + .917 s_f$

The six saturations with ‘g’ are therefore:

    .9   .8   .7   .6   .5   .4

and every correlation coefficient in the matrix can be seen to be
the product of two of these saturations, e.g.,

    .56 = .8 × .7

or  $r_{bc} = r_{bg} \times r_{cg}$

Now all scores and tests have been standardized, that is, they
have been given as differences from the average, being given a
plus sign if above and a minus sign if below, and further have
been divided by the standard deviation of each set. The standard
deviation of these ‘z’ scores is therefore unity, and so is the
variance of each test (variance = square of standard deviation).
Thus the sum of the squares of the saturations of all the ‘factors’
equals unity (the total variance).
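The specific loadings in these equations follow from the rule that the squared saturations add up to unity. A minimal sketch of the arithmetic, using the six g saturations of the example (the variable and function names are illustrative only):

```python
import math

# Saturations with g for tests a to f, as in the worked example.
g_saturations = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5, "f": 0.4}


def specific_loading(g):
    """Saturation with s when g^2 + s^2 = 1 (unit variance for the test)."""
    return math.sqrt(1.0 - g * g)


specific = {test: specific_loading(g) for test, g in g_saturations.items()}

# Any correlation in the table is the product of two g saturations.
r_bc = g_saturations["b"] * g_saturations["c"]

for test in g_saturations:
    print(f"z_{test} = {g_saturations[test]:.1f} g + {specific[test]:.3f} s_{test}")
```

Rounded to three places the loop reproduces the coefficients of the six equations above.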

We have already seen that the tetrad equation
$r_{13} r_{24} - r_{14} r_{23} = 0$ is really another way of writing
the minor determinant which represents the intercorrelations of two
tests with two others.

              3       4

      1      r13     r14
      2      r23     r24

The process can be extended and tetrad differences of tetrad
differences can be found.

Suppose we extend the tetrad (or a minor determinant of order
two) to a nonad (or a minor determinant of order three). We
could obtain this from the correlation coefficients of three tests
1, 2, 3 with three others 4, 5, 6.

              4       5       6

      1      r14     r15     r16
      2      r24     r25     r26
      3      r34     r35     r36

It is at once evident that this minor determinant of order three
can be divided into four determinants of order two (or tetrads):

      | r14  r15 |      | r14  r16 |
      | r24  r25 |      | r24  r26 |

      | r14  r15 |      | r14  r16 |
      | r34  r35 |      | r34  r36 |

This is done by taking the top left coefficient r14 as the ‘pivot’.
The four tetrad differences are themselves formed into a tetrad
and this can be evaluated. This operation is known as pivotal
condensation.¹ It must be remembered that the result, if not
zero, has to be divided by the product of all the pivots except the
last.


A Note on Tetrad Relations.

Adapted from Piaggio, Mathematical Gazette, Vol. XVII, No. 222.

Suppose that we have k sets of numbers denoted briefly by a, b, ... and that
these are expressible in terms of (k + 1) other sets g, s_a, s_b, ... no two of which
are correlated, and 2k constants m_a, m_b, ..., n_a, n_b, ... by equations such as:

    $a = m_a g + n_a s_a$ . . . (1)
    $b = m_b g + n_b s_b$ . . . (2)

Each equation really denotes N equations, as a can take any one of the values
a_1, a_2, ... with a corresponding set of values for g and s_a. But m_a and n_a are con-
stants which occur unchanged in each of the N equations. Taking the arithmetic
mean of the N expressions of a given type (called averaging) gives us:

    average of a = 0
    average of a² = $\sigma_a^2$
    average of ab = $\sigma_a \sigma_b r_{ab}$

If all the numbers have been reduced to standard measure (i.e., mean of numbers =
0 and σ = 1) these averages reduce to 0, 1 and r_ab respectively.

From equations (1) and (2) we get

    $ab = m_a m_b g^2 + m_a n_b g s_b + m_b n_a g s_a + n_a n_b s_a s_b$

from which by averaging and noting that g and s are uncorrelated

    $r_{ab} = m_a m_b$ . . . (3)

Similarly $r_{cd} = m_c m_d$ and so on.

Hence $r_{ab} r_{cd} - r_{ac} r_{bd} = 0$

By permuting the letters a, b, c, d we get three such relations, but only two are
independent.
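Pivotal condensation of a determinant of order three can be sketched in a few lines. The correlations in the first matrix below are invented purely for illustration, and the function names are not taken from any source; the second matrix is built from assumed g saturations so that it forms a perfect hierarchy.

```python
def det2(a, b, c, d):
    """Value of the tetrad | a b ; c d |, found by cross-multiplying."""
    return a * d - b * c


def det3_direct(m):
    """Determinant of order three expanded along its first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def det3_condensed(m):
    """Determinant of order three by pivotal condensation on m[0][0]."""
    pivot = m[0][0]
    if pivot == 0:
        raise ValueError("the pivot must not be zero")
    # Four tetrads, each containing the pivot in its top left corner.
    tetrads = [
        [det2(pivot, m[0][j], m[i][0], m[i][j]) for j in (1, 2)]
        for i in (1, 2)
    ]
    # The tetrad of tetrads is divided by every pivot except the last.
    return det2(tetrads[0][0], tetrads[0][1], tetrads[1][0], tetrads[1][1]) / pivot


# Hypothetical correlations of tests 1, 2, 3 (rows) with tests 4, 5, 6 (columns).
nonad = [[0.5, 0.4, 0.3], [0.4, 0.6, 0.2], [0.3, 0.2, 0.7]]

# A perfect hierarchy: each entry is the product of two assumed g saturations.
hierarchy = [[p * q for q in (0.6, 0.5, 0.4)] for p in (0.9, 0.8, 0.7)]

print(det3_condensed(nonad), det3_direct(nonad))
print(det3_condensed(hierarchy))
```

The two methods agree on the first matrix, and the hierarchical matrix condenses to zero, as a table of rank one should.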

If we do not include the numbers in the diagonals which repre-
sent the self-correlation of a test, we can reduce the minor de-
terminants of orders two and upwards in the correlation matrix
and it may happen that all the minors of a particular order vanish.
The ‘rank’ of the matrix is equal to the order of its greatest non-
vanishing minor (in terms of its rows) and is one less than the
order of the minors which vanish.

Thurstone has shown that a set of tests can be analysed into a
number of factors, common to each test, equal to the rank of their
correlation matrix plus a specific factor for each test. The factor
‘loadings’ or ‘saturations’ in each test can be determined by using
the ‘centroid’ or ‘centre of gravity’ method. It is called the
‘centroid’ method because Thurstone conceived it as a means of
finding a centroid or centre of gravity in a geometrical model.
As we have already seen it is easy to make a model which contains
only three vectors (whether these are test-scores or factors) but
space of four or more dimensions, though it offers no particular
difficulty to the mathematician, cannot be modelled in the ordinary
‘Euclidean’ way. The geometry of ‘hyperspace’ is a logical ex-
tension of that of three dimensions and it usually yields readily to
analytical treatment. That is to say, instead of worrying about the
difficulty or impossibility of making useful models we can find and
develop the simple algebraic equivalent.*

¹ See Turnbull and Aitken, Theory of Canonical Matrices, or Thomson, The
Factorial Analysis of Human Ability, Chapter VI.

* The student who is not able to work through Thomson’s The Factorial Analysis
of Human Ability or Burt’s The Factors of the Mind may obtain a simple account of
modern work in this field in Thomson’s booklet Some Recent Work in Factorial
Analysis and in Burt’s review of Thomson’s books in The British Journal of
Educational Psychology, Vol. XVII, February 1947.

Spearman’s work has not gone unchallenged. Although it is
true to say that the tetrad differences of Spearman’s hierarchies
were either zero, or were normally distributed about zero, it must
be confessed that there was a tendency to consider too few cases
and perhaps to overlook tests which did not fit in with the
hierarchy.

Spearman and his school analysed the results of too few tests,
and too readily assumed that all the tetrad differences were
normally distributed about zero. Later, many tests were found
which did not fit in with the two-factor theory, and group factors
had to be admitted. Thurstone of Chicago, using a more extended
analysis, showed that the Spearman results were only a particular
case of a larger generalization. It is beyond the scope of this
introductory work to give a detailed account of Thurstone’s
various methods. As in other cases they can be thought of in
geometrical and in corresponding algebraical terms. For the pur-
pose of explanation the former method is useful but it is the
analytical processes (matrices and determinants) which are
actually used for calculating the factors.

Other workers have found group factors, such as a verbal factor
V which is common to a group of tests but not to all. This could
be represented like this:

[Two tables set side by side, ‘Group factors with g and s’ and
‘Spearman’s g and s’, each listing tests A to I. The first has
columns for the general factor, the group factors and the specific
factors; the second has columns for the general factor and the
specific factors only. The crosses marking which factors enter
each test are not legible in this copy.]

As we have already seen, the pioneer work of Spearman
described in The Abilities of Man with his g and s factors was
limited. Doubtless, he was justified in drawing the conclusions
which he arrived at from the mental tests which he applied and
the analysis of his results. Nevertheless, further researches have
shown the need for more factors and the need for group factors
which are common to a limited number of test results. Some
method of multiple-factor analysis had to be found to deal with
group factors and to obviate the restriction of no correlation except
through a factor common to all tests.

It is beyond the scope of this work to deal with the methods of
multiple-factor analysis. There is a considerable literature on the
subject and the student would do well to start his study of the
matter with Thomson’s excellent Factorial Analysis of Human
Ability. Multiple-factor analysis has been developed by Sir Cyril
Burt in England and L. L. Thurstone and H. Hotelling in
America.

The most popular method at present in use is that due to Thur-
stone, or some modification of it. At the time of writing this book
the exact nature of the ‘factors of the mind’ is still a matter of
much discussion between psychologists. Even on the cognitive
side of mental activity various claims are put forward by different
workers concerning the nature, number and importance of these
factors. It is too early to decide whether they bear some relation
to neurological qualities of the brain, whether they are mathe-
matical artefacts, whether they are just convenient mathematical
symbols or whether they represent fundamental quantities in
human cognition.¹ (Attempts to submit the affective and conative
aspects of mental activity to factorial analysis are fraught with
even greater difficulty. The factors suggested by various psycho-
logists, which describe temperament and personality, are legion.
Raymond Cattell has listed over 1,000 traits which he has gathered
together and arranged in more than fifty ‘factors’. It is too early
to see whither this will lead us. It will suffice for the student to
know that there are well-marked personality traits, such as
‘ascendency-submission’, which are tested by questions and
marked according to a given scale.)

¹ Various leading psychologists in Britain and America have different ways of
regarding factors. Thomson and Allport and Anastasi maintain that factors are
statistical artefacts without any reality or neurological counterpart. Burt regards
them as principles of classification described by selective operators, whereas Spear-
man originally thought of them as fundamental functions of the mind. Guilford
calls them fundamental dimensions of the mind and the Americans Thurstone and
Holzinger regard factors as primary or fundamental abilities. The student need not
be unduly worried about this. The atomic physicist is up against similar problems
when he is considering such problems as the idea of the ‘reality’ of an electron.

A fruitful way of regarding tests, their correlations and factors
is to represent them as vectors or straight lines. Two lines may be
drawn through a point to represent the tests and the correlation
between them is numerically equal to the cosine of the angle made by
the two lines. The point of intersection of the lines represents
a person who has made an average score on both tests and
other points on each line represent standardized scores in the
tests, the positive direction being shown by the arrows. The
degree of correlation increases as the angle decreases and will be
perfect positive (+1) correlation when the lines coincide; there
will be zero correlation when they are at right angles and nega-
tive correlation when the angle becomes obtuse. Any point on
the paper represents the scores of a person in each of the tests and
each score is given by the perpendicular distance of the point
from one of the lines.
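The correspondence between a correlation and the angle between two test vectors is easily tabulated. The short sketch below uses only the relation r = cos θ stated above; the function names are illustrative.

```python
import math


def angle_for_correlation(r):
    """Angle in degrees between two test vectors whose correlation is r."""
    return math.degrees(math.acos(r))


def correlation_for_angle(degrees):
    """Correlation represented by two test vectors at the given angle."""
    return math.cos(math.radians(degrees))


for r in (1.0, 0.5, 0.0, -0.5):
    print(f"r = {r:+.1f}  angle = {angle_for_correlation(r):6.1f} degrees")
```

Coincident lines give r = +1, lines at right angles give r = 0, and an obtuse angle gives a negative correlation.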

The idea of zero correlation when the lines are at right angles
(cosine 90° = 0) is a useful one. Sometimes factors can be
thought of as vectors which are at right angles. They are then
wholly independent factors and have no common quantity or
overlap. Instead of speaking of them as rectangular factors we
use the Greek orthogonal to describe them. The factors for
which Spearman sought would thus be spoken of as orthogonal.
Oblique factors are those which could be represented by lines at
an angle with one another which is less than a right angle. Most
of the methods originated by Alexander, Thurstone and other
recent workers use oblique factors.

Let us represent two tests by the lines X'X and Y'Y meeting at
O. The cosine of angle XOY = the correlation between the tests. A
testee with average marks in both tests will be at the point O and
other testees will be represented by swarms of dots, like bullet holes
round a bull’s-eye O, with the density of dots per unit area be-
coming smaller the further we go from O. Now the analysis
of test results is equivalent to referring the scattered dots to axes
at right angles, and these latter will represent orthogonal factors.
Consider the simplest case of two factor vectors OA and OB
respectively bisecting the angles between the test vectors. This
was Hotelling’s idea, and OA and OB would represent his
‘principal components’. There is no
necessity, however, for OA and OB to be placed in the position
we have taken. They could be placed anywhere provided that
they passed through O and were at right angles (orthogonal).
These factor vectors can be rotated to the most convenient
position; indeed, if either OA or OB is made to coincide with
either OX or OY one of the factors is given by one of the test
vectors.

When OA bisects the angle XOY, as it does in the case we have
given, the scores along OA clearly give the best representation of
the results of the two tests. Such a vector is known as the ‘first
principal component’. (Hotelling.)
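Why the bisecting vector gives the best representation can be illustrated numerically. The sketch below is stated in terms of weights on two standardized scores rather than the drawn vectors: it assumes a composite cos t · x + sin t · y, whose variance is 1 + 2r sin t cos t, and an illustrative correlation r = .6 chosen for the example. Equal weights (t = 45°) correspond to the bisector.

```python
import math


def variance_along(theta_degrees, r):
    """Variance of cos(t)*x + sin(t)*y for standardized x, y with correlation r."""
    t = math.radians(theta_degrees)
    return 1.0 + 2.0 * r * math.sin(t) * math.cos(t)


r = 0.6  # hypothetical correlation between the two tests

# Search whole-degree directions for the largest and smallest variance.
first_component = max(range(0, 180), key=lambda d: variance_along(d, r))
second_component = min(range(0, 180), key=lambda d: variance_along(d, r))

print(first_component, variance_along(first_component, r))
print(second_component, variance_along(second_component, r))
```

Under these assumptions the largest variance, 1 + r, lies along the equally weighted direction and the smallest, 1 - r, lies at right angles to it.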

In the case of a Spearman analysis of two tests three orthogonal
factors would be necessary, that is, a common g and two separate
s factors. Thus his factors may be represented by three straight
lines at right angles meeting in a point like three edges of a
rectangular box meeting at a corner. These three vectors (still
remaining at right angles to one another) are rotated until one
is at right angles to the first test and another is at right angles to
the second test. Then, g is represented by the third vector. In
general, Spearman’s ‘two-factor’ analysis requires one more
dimension in space than the number of tests. Again, we have to
use the geometry of hyperspace, where models are of only limited
help.

If we wish to add a third test to those which we have represented
by the two lines through the point O on a plane surface we shall
have to consider three-dimensional space. We shall find from
trigonometrical tables angles whose cosines are the correlation
coefficients of the third test and each of the other two respectively.
We shall then find a line going through O which makes these
angles with the first two vectors. Usually we shall obtain a kind
of tripod with one of the vectors coming out of the plane of the
paper. If the sum or the difference of the angles which we have
found is exactly equal to the angle between the two original test
lines, the three lines will be in the plane of the paper. Again, if
any two angles together are less than a third angle it will be
impossible to draw the third line. It will be ‘imaginary’ in the
mathematical sense. More than three tests demand the use of
multi-dimensional space and although this cannot be visualized,
it is nevertheless a useful mathematical device for work with four
or more tests.
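The three cases just described (a tripod, three lines in one plane, or an impossible third line) can be told apart from the correlations alone. The following is a sketch under the stated rule that each angle is the one whose cosine is the correlation; the function name, the tolerance, and the extra check that the three angles cannot together exceed 360° are additions made for the illustration.

```python
import math


def angle_between(r):
    """Angle in degrees whose cosine is the correlation r."""
    return math.degrees(math.acos(r))


def place_third_test(r12, r13, r23, tolerance=1e-6):
    """Say whether three test vectors with these correlations can be drawn."""
    a12, a13, a23 = angle_between(r12), angle_between(r13), angle_between(r23)
    smallest, middle, largest = sorted([a12, a13, a23])
    total = a12 + a13 + a23
    # Two angles together less than the third: no such line exists.
    if largest > smallest + middle + tolerance or total > 360.0 + tolerance:
        return "impossible"
    # Sum (or difference) of two angles exactly equal to the third.
    if abs(largest - (smallest + middle)) <= tolerance or abs(total - 360.0) <= tolerance:
        return "in the plane"
    return "out of the plane"


cos30 = math.cos(math.radians(30.0))
print(place_third_test(0.5, cos30, cos30))   # 60 = 30 + 30
print(place_third_test(0.5, 0.5, 0.5))       # three angles of 60
print(place_third_test(0.9, 0.9, -0.9))      # about 26 + 26 < 154
```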


Note on Correlation Matrices and Lines of Regression

Consider the following correlation matrix in which x0, x1,
x2 . . . etc. are tests of certain aptitudes:

           x0     x1     x2     x3    ...    xn

    x0      1     r01    r02    r03   ...    r0n
    x1     r01     1     r12    r13   ...    r1n
    x2     r02    r12     1     r23   ...    r2n
    x3     r03    r13    r23     1    ...    r3n
    ...
    xn     r0n    r1n    r2n    r3n   ...     1

Each of the correlation coefficients r may also be considered as
the regression of the score in one test on that of another. In other
words, the estimated score in one ability or aptitude x0 is expressed
as a linear function of the scores in a number of others x1, x2,
x3 . . . xn. The regression equation becomes:

    $x_0 = b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$

where b1, b2, b3 ... bn are the regression coefficients.

It is sometimes necessary to know how far estimates made from
regression equations differ from the true values.

This is given by the multiple correlation (Rm) between the
estimates and the true values.

Now $R_m = \sqrt{b_1 r_{01} + b_2 r_{02} + b_3 r_{03} + \dots + b_n r_{0n}}$

Those who have some knowledge of determinants will see that
this may be expressed as

    $R_m = \sqrt{1 - \frac{\Delta}{\Delta_{00}}}$

where Δ is the complete correlation determinant (or matrix)
given above and Δ00 is the minor determinant which is left when
the first row and column are removed.
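The two expressions for the multiple correlation can be compared on a small case. The sketch below assumes standardized scores and three tests with invented correlations (r01 = .6, r02 = .5, r12 = .3); the regression coefficients use the usual two-predictor formulae, and the determinant routine is a plain elimination written for the illustration.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with row interchanges."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot_row][col]) < 1e-15:
            return 0.0
        if pivot_row != col:
            m[col], m[pivot_row] = m[pivot_row], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def multiple_correlation(corr):
    """R_m = sqrt(1 - D / D00), D00 being the minor without row and column 0."""
    minor = [row[1:] for row in corr[1:]]
    return math.sqrt(1.0 - determinant(corr) / determinant(minor))


# Hypothetical correlations among x0, x1 and x2.
r01, r02, r12 = 0.6, 0.5, 0.3
corr = [[1.0, r01, r02], [r01, 1.0, r12], [r02, r12, 1.0]]

# Regression coefficients for two predictors in standard measure.
b1 = (r01 - r02 * r12) / (1.0 - r12 ** 2)
b2 = (r02 - r01 * r12) / (1.0 - r12 ** 2)

from_determinants = multiple_correlation(corr)
from_coefficients = math.sqrt(b1 * r01 + b2 * r02)
print(from_determinants, from_coefficients)
```

Both routes give the same value of Rm for this table.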

Similarly we could use the second regression equation and find
estimates of x when y is given, and these errors of estimate would
be distributed with a standard deviation:

    $\sigma_x \sqrt{1 - r_{xy}^2}$

Here we find again the alienation (k), where $k = \sqrt{1 - r^2}$.


CHI-SQUARED AND CONTINGENCY

One of the most useful methods of investigating the numerical
results of educational research is the use of chi-squared, χ².
Pearson developed this at the beginning of the present
century and in recent years it has become popular in attacking
many problems requiring the analysis of variance. The most
common and straightforward use of χ² is that of testing the
agreement between observed quantities and those expected in
view of an apparently suitable hypothesis. For instance, we might
wish to find whether a set of measures fit a normal distribution
curve to such an extent that any discrepancies are due to errors
of sampling and are not significant.

If F_e is a number expected and x is the difference between this
and the actual number observed F (i.e. the observed number
F = F_e + x)

then $\chi^2 = \sum \frac{x^2}{F_e}$

It is obvious that in the case of perfect agreement between the
observed and expected values χ² will vanish and its value will be
smaller in accordance with the closeness of agreement between
the sets of values. Tables have been prepared which give a value
for P, the proportion of cases in which any value of χ² is exceeded.
The tables give the relations between χ² and P, the probability,
for various values of n, which must be an integer and represents
the number of degrees of freedom or independent variates of the observed
classes. In educational investigations there arise many cases
where we might wish to find whether the differences between
theoretical or predicted values and those actually observed were
due to chance errors of sampling or whether the differences are
significant. The chi-square method is also useful to test the
‘goodness of fit’ of a set of given values to those represented by a
standard curve. For example, we know from tables the values of
the ordinates of the normal probability curve at various sigma
distances from the mid-point. We may be given a set of values to
fit to the curve* and the ‘goodness of fit’ may be estimated by χ².
Again, we may wish to compare teachers’ estimates of pupils’
work in classes (A, B, C, D etc.) with their subsequent achieve-
ments in examinations. Again, we may wish to compare group-
ings or estimates with respect to one factor, quality or attainment
with those of another. Here we use a contingency table and from
this we may obtain a value for the probability that the differences
are not due to chance. χ² does not normally measure correlation;
it is really a measure of divergence rather than association.

* See page 75.

Example: The following table gives the theoretical frequencies
f_e and the observed frequencies f in fitting values to a normal
curve at the given intervals. Find whether the fit is good and
whether any deviations from normal distributions are due to
chance fluctuations.

The table should be set out as follows:

    Interval      f     f_e    (f - f_e)   (f - f_e)²   (f - f_e)²/f_e

    280-340      17     15        2            4             .27
    260-280      13     15       -2            4             .27
    240-260      20     20        0            0             .00
    220-240      27     24        3            9             .38
    200-220      23     25       -2            4             .16
    180-200      19     21       -2            4             .19
    160-180      15     17       -2            4             .23
    100-160      23     20        3            9             .45

    Totals      157    157        0                     χ² = 1.95

Knowing seven of the observed frequencies and the total, we
could find the eighth. Thus, there are (8 - 1) = 7 degrees of
freedom. By consulting the Fisher or Elderton tables for 7
degrees of freedom and χ² = 1.95 we find a probability value of
P = .96. This means that even if the function were distributed
normally throughout all its measures, as great a discrepancy as
we have obtained would occur in samples 96 times in 100. The
fit is in fact better than usual, for the most probable value of P
for a true fit is .50. [If the process were repeated for many samples
with the same mean and standard deviation the number of
degrees of freedom would be two less, i.e. 5. The value for P in
this case would be .84.]
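The arithmetic of the goodness-of-fit example can be checked with a few lines of code. This is a sketch only: the frequencies are read from the table of the example, and the expected frequency of 17 for the interval 160-180 is an inference from the column total of 157 and the printed difference, not a figure quoted from the text.

```python
# Observed (f) and theoretical (f_e) frequencies for the eight intervals,
# taken in the order of the table, from 280-340 down to 100-160.
observed = [17, 13, 20, 27, 23, 19, 15, 23]
expected = [15, 15, 20, 24, 25, 21, 17, 20]  # 17 is inferred, see above

# Chi-squared is the sum of (f - f_e)^2 / f_e over the classes.
chi_squared = sum((f - fe) ** 2 / fe for f, fe in zip(observed, expected))

# One class is fixed once the total and the other seven are known.
degrees_of_freedom = len(observed) - 1

print(round(chi_squared, 2), degrees_of_freedom)
```

Summing the unrounded terms gives a value a little under the 1.95 obtained from the rounded column, which makes no difference to the reading of the tables.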

It often happens that it is necessary to determine the degree of
association between two sets of measures which are not normally
distributed but are given in the form of numbers in each of a
series of classes in both sets of measures. For instance, we may
mark a set of Physics papers in four classes A, B, C and D without
further distributions within each class. In the same way we may
mark a set of Chemistry papers in four (or some other number of)
classes of merit A, B, C, D.

We wish to find whether there is a significant degree of associa-
tion between the two sets.

It is convenient to arrange the number of cases which fall into
each group (the frequency in the group) in a cell in a square or
rectangle.

                            PHYSICS

                       D      C      B      A     Total

                 A     1      0      3      6      10
    CHEMISTRY    B     2      5      5      1      13
                 C     3      3      1      2       9
                 D     4      3      0      1       8

               Total  10     11      9     10      40

Here we have sixteen cells or categories and each one represents
a group in Physics and one in Chemistry so that every possible
case is covered. The number in each cell represents the number of
students in each category, e.g. 6 students have A marks in Physics
and in Chemistry, 3 have a D mark in Chemistry and a C mark in
Physics. If there were no correlation between the sets of marks
we might expect the 10 students with A’s in Chemistry to be
distributed in the proportion 10 : 11 : 9 : 10 in their Physics groups,
that is to say, about equal numbers in each group.

Suppose now that there were no relationship between the
groups in Chemistry and those in Physics. Let us calculate how
many students would fall into each of the 16 cells in this case.
(F_e is the expected frequency.)


    F_e for A in Chemistry and D in Physics = $\frac{10 \times 10}{40}$

    F_e for A in Chemistry and C in Physics = $\frac{10 \times 11}{40}$

    F_e for A in Chemistry and B in Physics = $\frac{10 \times 9}{40}$

and so on.

Now make a 4 × 4 table of these F_e’s:

             D       C       B       A

     A     2.50    2.75    2.25    2.50
     B     3.25    3.57    2.92    3.25
     C     2.25    2.47    2.02    2.25
     D     2.00    2.20    1.80    2.00

                TABLE OF F_e’s

             D       C       B       A

     A     1.50    2.75     .75    3.50
     B     1.25    1.43    2.08    2.25
     C      .75     .53    1.02     .25
     D     2.00     .80    1.80    1.00

             TABLE OF (F - F_e)
            F = actual frequency

Note that in view of later squaring the signs are all written as
positive.


The next table gives $\frac{(F - F_e)^2}{F_e}$, that is, the numbers in the last
table were squared and divided by their respective F_e’s.

             D       C       B       A

     A      .90    2.75     .25    4.90
     B      .48     .57    1.48    1.85
     C      .56     .11     .52     .03
     D     2.00     .29    1.80     .50

           TABLE OF (F - F_e)² / F_e

The sum of all the $\frac{(F - F_e)^2}{F_e}$ numbers,

    i.e. $\sum \frac{(F - F_e)^2}{F_e} = \chi^2$ (chi-squared) = 18.98.

On consulting Fisher’s or Elderton’s tables the value of P, the
probability for χ² = 18.98 and 9 degrees of freedom (see the note
on degrees of freedom below), is equal to
.025. Thus the chances are 1 in 40 that the deviations of the actual
from the expected frequencies could be through chance errors of
sampling. Accordingly, we have grounds for believing that there
is a contingency or relationship between the variables.
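The whole contingency calculation can be retraced from the cell frequencies. In the sketch below the sixteen observed frequencies are an inference: they are the only whole numbers consistent with the marginal totals (10, 13, 9, 8 for Chemistry; 10, 11, 9, 10 for Physics), with the printed tables of F_e and (F - F_e), and with the two cells quoted in the text. They should be treated as an assumption rather than as figures taken from the original table.

```python
# Rows: Chemistry A, B, C, D. Columns: Physics D, C, B, A.
# Inferred cell frequencies (see the note above); not quoted from the text.
observed = [
    [1, 0, 3, 6],
    [2, 5, 5, 1],
    [3, 3, 1, 2],
    [4, 3, 0, 1],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

# Expected frequency of a cell = row total x column total / grand total.
expected = [[r * c / n for c in column_totals] for r in row_totals]

chi_squared = sum(
    (f - fe) ** 2 / fe
    for f_row, fe_row in zip(observed, expected)
    for f, fe in zip(f_row, fe_row)
)

# With fixed marginal totals the degrees of freedom are (r - 1)(c - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_squared, 2), degrees_of_freedom)
```

On these assumed frequencies the recomputed χ² is about 18.4, a little below the printed 18.98. The difference seems to lie in two printed entries, .56 and 1.85, which recompute as .25 and 1.56. Either value exceeds 16.92, the 5 per cent point for 9 degrees of freedom, so the conclusion of a significant contingency is unaffected.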


The Coefficient of Mean Square Contingency

The coefficient of mean square contingency is given by

    $C = \sqrt{\frac{\chi^2}{N + \chi^2}}$

In the example we have worked out

    $C = \sqrt{\frac{18.98}{40 + 18.98}} = .57$

Contingency is a better measure of divergence than association
and should be regarded as such. Nevertheless, if the number of
cells used were increased and a finer grouping obtained, C would
approach in value to that of the correlation only if the distribu-
tions of both sets of measures were normal or nearly normal.
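The coefficient is a one-line calculation once χ² and N are known. A minimal sketch using the figures of the worked example (χ² = 18.98, N = 40); the function name is illustrative.

```python
import math


def contingency_coefficient(chi_squared, n):
    """Coefficient of mean square contingency, C = sqrt(chi^2 / (N + chi^2))."""
    return math.sqrt(chi_squared / (n + chi_squared))


c = contingency_coefficient(18.98, 40)
print(round(c, 2))
```

Since the denominator always exceeds the numerator, C can never reach unity however strong the association.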


A Note on Degrees of Freedom

Chi-squared tables give the value of the probability P in terms
of χ² and the number of degrees of freedom. This number is not
usually equal to the number of cells in the contingency table or
the number of cases, but is usually one less. Nevertheless, as
R. A. Fisher has shown, the number of degrees of freedom, when
the marginal totals remain the same sample after sample, will be
(c - 1)(r - 1) where c is the number of columns and r is the
number of rows. We have to ask ourselves how many cells could
be filled in from prior knowledge and subtract this from the total
number of cells in order to obtain the number of degrees of
freedom; e.g. if we have a 4 × 4 table and can assume that the
marginal totals remain fixed we should be able to compute the
fourth row or column in each case knowing the three others.

The number of degrees of freedom is therefore

    (4 - 1)(4 - 1) = 9.


Note on Student’s t¹

Student’s ‘t’ is defined as

    $t = \frac{x}{\sigma_x}$

where x is the deviation of a measure from the true value which
is assumed from a normal distribution and σ_x is the standard
deviation of all the measures in the sample. Student worked out
the distribution of t (which he originally called z) and found that
it was particularly useful for working with small samples. At first
Student carried his table only to N = 10 and found that the
standard error of his distribution was $\frac{1}{\sqrt{N - 3}}$, and later Fisher
developed the table in terms of N - 1 degrees of freedom. Most
of Fisher’s tables are constructed so that a probability of 5%
(a chance of 1 in 20) is significant and a probability of 1% is highly
significant. In the case of a normal distribution (n very large) a
probability of 5% corresponds to a t of 1.96 and a probability of
1% corresponds to a t of 2.58.
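The two limiting values quoted for very large samples are simply deviates of the normal curve, and can be recovered without tables. The sketch below uses the normal distribution from Python's standard library and assumes that the 5% and 1% levels are taken on both tails together; the function name is illustrative.

```python
from statistics import NormalDist


def two_tailed_deviate(probability):
    """Normal deviate exceeded, in either direction, with the given probability."""
    return NormalDist().inv_cdf(1.0 - probability / 2.0)


five_per_cent = two_tailed_deviate(0.05)
one_per_cent = two_tailed_deviate(0.01)
print(round(five_per_cent, 2), round(one_per_cent, 2))
```

For small samples the corresponding values of t are larger and must be taken from the table for the appropriate degrees of freedom.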


¹ ‘Student’, whose real name was William Sealy Gosset, died in 1937. He was a
senior member of the brewing firm of Guinness, in whose service he developed much
of his statistical work. He chose his pseudonym out of respect for the ‘master’,
Karl Pearson.


THE ANALYSIS OF VARIANCE

Standard deviation has proved so useful as a measure of
dispersion, as a step to correlation, factor analysis and the
use of the normal curve that the more recent and often
more useful technique of the analysis of variance has tended to be
overlooked. It is possible that the influence of Spearman, who
made such great use of the correlation coefficient in his technique of
factor analysis, did something to hinder the development of the
more widespread use of the analysis of variance.¹

¹ As has already been noted, the psychologist of a generation ago borrowed some-
thing of the terminology and technique of the Galton-Pearson school of bio-
metricians. In recent times the work of Professor R. A. Fisher, formerly of the
Rothamsted Experimental Station, in statistics chiefly concerned with agriculture
and other biological investigations has been adapted to psychological needs, particu-
larly by Sir Cyril Burt in this country. The most valuable aspects of Fisher’s work
for our purposes are (a) his methods of designing experiments so that the results
shall be susceptible to simple statistical treatment, (b) the analysis of variance.
Details of his methods (with particular reference to agriculture) are to be found in
Statistical Methods for Research Workers and Design of Experiments. Burt’s exposi-
tions have a simplicity and clarity not always to be found in these treatises.

Variance may be regarded as the square of the standard deviation

    $V = \sigma^2 = \frac{\sum d^2}{N}$

where N is the number of measures and d is the deviation of a
measure from the mean of all the measures.

(If the measures have been standardized by arranging them as
deviations from their mean and dividing them by the standard
deviation the S.D. is therefore the unit of measurement, i.e.
S.D. = 1 and V = 1.)

If we regard the mean as the first moment about the point from
which the mean is measured, the variance of a distribution may
be defined as the second moment about the mean:

    $\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$

where x is a score or measure

and x̄ is the mean of the whole distribution.

Variance as a measure of variability has an advantage because
it is additive, that is, the total variance of a set of measurements
may be regarded as the sum of the independent parts or ‘factors’
which combine to make up the variance.

    $\sigma_x^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \dots$ etc.

if x = a + b + c + . . .
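The additive property holds only when the parts are uncorrelated, and a tiny constructed example shows it exactly. The three sets of numbers below are invented for the purpose: each has a mean of zero and every pair has zero covariance.

```python
from statistics import pvariance

# Three mutually uncorrelated components, each with mean zero (invented data).
a = [1, -1, 1, -1]
b = [2, 2, -2, -2]
c = [3, -3, -3, 3]

# Each measurement is the sum of its three parts.
x = [ai + bi + ci for ai, bi, ci in zip(a, b, c)]

total_variance = pvariance(x)
sum_of_parts = pvariance(a) + pvariance(b) + pvariance(c)
print(total_variance, sum_of_parts)
```

The variance of the sums (14) equals 1 + 4 + 9, the variances of the parts. Were the parts correlated, twice their covariances would have to be added as well.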

In the analysis of variance the process is reversed and the total
variance is broken down into those of the several components.
One of these variances will obviously be due to error in measure-
ment and usually will be taken to consist of random errors due to
the smallness of the size of the sample which has been used for the
investigation. The most frequent and useful application of the analysis
of variance is to compare the significance of the variance due to some
particular factor with the amount of variance due to error.

(It will be recalled that in factor analysis the factors have to be
discovered in the process of the analysis and their relative amounts
estimated. In the analysis of variance the possible factors are
assumed by reference to the given data and the problem is to
establish their relative significance, that is, to find what is the
probability that the variance due to each factor is to be accounted
for as an effect of pure chance. In factor analysis we try to
determine the relative importance of the inferred factors.)

Let us consider a set of marks (x) which have been correlated
with another set (y). Were all the individuals in the x column to
have the same value there would still remain some scatter in the
y column, that is, when x is constant there is yet some variability
in the y scores. When there is correlation between the x and y
values this remaining variability, expressed as a ratio, is
$\sigma_c^2 / \sigma_y^2$, where $\sigma_c^2$ is the variance of y when x is constant. As this is the
proportion of the variance ($\sigma_y^2$) remaining when x is constant it
may be considered the proportion of the variance in y attributable
to factors in y other than x. Conversely, the reduction in variance
when x is kept constant is the part of the variance due to the x factor.
In terms of the entire variance of y the ratio is

$$\frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2}$$

But

$$r = \sqrt{1 - \frac{\sigma_c^2}{\sigma_y^2}}$$

Therefore

$$r^2 = \frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2}$$

Accordingly the total variance may be divided into two parts
of which the proportion due to what is common to x and y is equal
to $r^2$, and the proportion due to the other factors is

$$\frac{\sigma_c^2}{\sigma_y^2} = 1 - r^2$$

$r^2$ is known as the coefficient of determination.
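The division of the variance of y into the two shares can be verified on a small example. The following is a minimal sketch, assuming a straight regression line fitted by least squares; the five pairs of marks are hypothetical and the function name `determination` is introduced here for illustration only.

```python
from statistics import mean

def determination(xs, ys):
    """Return (r squared, share of y-variance left over) for the line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r_squared = sxy ** 2 / (sxx * syy)
    slope = sxy / sxx
    # Scatter of y about the regression line: what remains when x is held constant.
    residual = sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
    return r_squared, residual / syy

# Hypothetical marks for five pupils.
x_marks = [1, 2, 3, 4, 5]
y_marks = [2, 1, 4, 3, 5]

r2, other = determination(x_marks, y_marks)
print(r2, other, r2 + other)
```

For these figures the share common to x and y is 0.64 and the share due to other factors is 0.36, and the two add to unity.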

[The above is true when correlation is linear and the line of
regression is straight. Nevertheless, a similar relationship exists
when the correlation is not linear and the correlation ratio $\eta$
(eta) is used. In this case, the variance of y is
separable into two parts: the proportion due to x is

$$\frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2} = \eta^2$$

and the proportion due to the other factors is

$$\frac{\sigma_c^2}{\sigma_y^2} = 1 - \eta^2.]$$


In the analysis of variance the easiest way is to consider the
average for each class implied by the factor. For example, we
might require to find whether on the average males or females
are more intelligent. All we have to do is to find the respective
means of intelligence-test scores and to determine whether the
difference between the two means can be attributed to the effects
of random sampling. Here the classification is dichotomous, but
if we have to consider, in addition to sex, differences arising from
race or school, we should have multiple classification and should
have to compare a number of means all derived from the same
principle of classification.

Thus, it is useful in the case of the simple sex classification to
find the standard error of the difference between the two averages,
for this will tell us whether the difference is significant or
attributable to chance errors of sampling.


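S.E. of a difference of means:

$$\sigma_D = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$

$N_1$ and $N_2$ are the numbers in each of the two sets respectively
and $\sigma_1$ and $\sigma_2$ are their standard deviations. The corresponding probable error is

$$\text{P.E.} = 0.6745 \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$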



The S.E. divided into the difference between the averages 
should give a quotient of at least 3, though if it were above 2 it 
might be worth while continuing the investigation. 
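The rule of thumb just stated is easily mechanised. The sketch below is illustrative: the means, standard deviations and numbers of cases are hypothetical, and the helper names `se_difference` and `critical_ratio` are introduced here and are not from the text.

```python
from math import sqrt

def se_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def critical_ratio(mean1, mean2, sd1, n1, sd2, n2):
    """Difference between the averages divided by its standard error."""
    return (mean1 - mean2) / se_difference(sd1, n1, sd2, n2)

# Hypothetical groups: means 52 and 50, S.D.s 6 and 8, with 144 and 256 cases.
ratio = critical_ratio(52.0, 50.0, 6.0, 144, 8.0, 256)
print(round(ratio, 3))
```

A quotient between 2 and 3, as here, falls in the range the text describes as possibly worth further investigation but not yet established.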

Other standard errors which are useful in educational research 
are as follows: 

Standard error of a difference between the averages of scores
which are intercorrelated. If we wish to consider the significance
of the difference between the averages of scores in two tests or in
repeated tests taken by a single set of N persons

$$\sigma_D = \sqrt{\frac{\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2}{N}}$$

or, if $\sigma_1$ and $\sigma_2$ are taken to represent the S.E.s of the means of the
original scores and not the S.D.s of the original scores,

$$\sigma_D = \sqrt{\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2}$$

In view of the differences which arise through errors of sampling
the average of a sample may vary from the true average which
would be found if we were able to take a very large number.

$$\text{S.E. of the mean or average} = \frac{\sigma}{\sqrt{N}}$$

where $\sigma$ is the standard deviation of the original sample.

In the same way differences in the nature of samples (‘errors of
sampling’) may cause errors in the S.D.s of a sample.

The standard error of a standard deviation is

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}}$$

The standard error of a difference between two standard
deviations is equal to

$$\sqrt{\frac{\sigma_1^2}{2N_1} + \frac{\sigma_2^2}{2N_2}}$$

where $\sigma_1$ and $\sigma_2$ are the standard deviations and $N_1$ and $N_2$ are
the numbers of cases in the respective groups or sets.

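Standard error of a percentage and of a difference between percentages.

If X is the percentage and N the number of cases, then

$$\text{Standard Error of } X = \sqrt{\frac{X(100 - X)}{N}}$$

and the standard error of a difference between two percentages
$X_1$ and $X_2$ is

$$\sqrt{\frac{X_1(100 - X_1)}{N_1} + \frac{X_2(100 - X_2)}{N_2}}$$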

The formulae are most useful in finding the numbers of cases
which it is necessary to investigate in order to be certain that
percentage differences between groups are significant. For example, it
appears from dental records that 40% of girls and 43% of boys at
certain schools are in need of dental treatment. What is the
minimum number of children which we must take in order to make sure
that the 3% difference is significant?

If the difference of 3% is reliable it should be at least 3 times
its S.E., so the S.E. should not be greater than 1%. Taking N children of each sex,

$$1 = \sqrt{\frac{40 \times 60}{N} + \frac{43 \times 57}{N}}$$

$$\therefore N = 2400 + 2451 = 4851$$

Thus to make sure that the 3% difference is significant the
investigation should be based on the examination of 4851 (say
5000) boys and an equal number of girls.
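The same calculation can be put in a form that is reusable for other percentages. This is a sketch only, assuming (as the dental example does) that the two groups are of equal size; the function name `cases_needed` is hypothetical.

```python
from math import ceil

def cases_needed(pct1, pct2, max_se):
    """Cases per group so that the S.E. of the percentage difference is at most max_se.

    Assumes both groups contain the same number of cases.
    """
    return ceil((pct1 * (100 - pct1) + pct2 * (100 - pct2)) / max_se ** 2)

# Dental example: 40% of girls, 43% of boys, S.E. not to exceed 1%.
n = cases_needed(40, 43, 1)
print(n)
```

Halving the permitted standard error quadruples the number of cases required, since N varies inversely with the square of the S.E.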


Problem 


A test has been applied to five arts students and five science
students. The marks obtained are given below. The average for
the arts students is 3 marks more than that of the science students.
With this small sample is this difference likely to be a matter of
chance or is it safe to assume that arts students are better on the
average?


          Arts Students                          Science Students
Name      Marks  Deviation  Square     Name      Marks  Deviation  Square
Cowper      21      +1         1       Maxwell     19      +2         4
Shaw        19      -1         1       Faraday     14      -3         9
Scott       18      -2         4       Darwin      18      +1         1
Stewart     23      +3         9       Dale        15      -2         4
Lamb        19      -1         1       Newton      19      +2         4
Total      100       0        16       Total       85       0        22
Mean        20                         Mean        17

Average of means = (20 + 17)/2 = 18.5

Deviation of arts mean = +1.5          Deviation of science mean = -1.5

To obtain the standard deviation we divide not by the number
of each set of cases but by the number of degrees of freedom. This is
an important conception in statistical analysis. In each column
there are 5 deviations from a mean calculated from the given
data. But the total of all the 5 deviations must be zero, and thus
if we know 4 deviations we can at once calculate the 5th. Accordingly
there are 4 degrees of freedom, i.e. only 4 deviations are
independent.

Thus the standard deviation of the individuals in the sample is

$$\sigma = \sqrt{\frac{\sum x^2}{n_1 + n_2 - 2}} = \sqrt{\frac{16 + 22}{8}} = \sqrt{\frac{38}{8}} = \sqrt{4.75} = 2.179$$

and the standard deviation of the difference is

$$\sigma_d = \sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{4.75 \times \frac{2}{5}} = 1.3784$$

The critical ratio t is given by

$$t = \frac{\text{mean}_1 - \text{mean}_2}{\sigma_d} = \frac{20 - 17}{1.3784} = 2.176$$

On consulting Yule and Kendall’s ‘t-table’ we find that for
8 degrees of freedom the probability of obtaining a difference as
large as this is P = 2(1 - .97) = .06 or 6%. The probability of
getting a difference as large as this by chance is 6 in 100, that is,
the odds against getting a difference as large as this by chance
are about 15 to 1. The difference cannot therefore be accepted
as really significant.
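The arithmetic of this problem may be retraced step by step. The sketch below uses the marks given in the problem; the variable names are introduced here for illustration, and the probability itself is still to be read from a t-table as in the text.

```python
from math import sqrt

arts = [21, 19, 18, 23, 19]
science = [19, 14, 18, 15, 19]

def sum_sq_dev(marks):
    """Sum of squared deviations about the group's own mean."""
    m = sum(marks) / len(marks)
    return sum((x - m) ** 2 for x in marks)

# One degree of freedom is lost for each group mean.
df = len(arts) + len(science) - 2
pooled_variance = (sum_sq_dev(arts) + sum_sq_dev(science)) / df

# Standard deviation of the difference between the two means.
se_diff = sqrt(pooled_variance * (1 / len(arts) + 1 / len(science)))

mean_diff = sum(arts) / len(arts) - sum(science) / len(science)
t = mean_diff / se_diff
print(df, pooled_variance, round(t, 3))
```

Carrying the full decimals through gives a critical ratio of about 2.176 on 8 degrees of freedom.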

Instead of comparing the difference between the means with 
a standard deviation derived from the individual measurements 
we can compare the variance of the means with a variance based 
on the original measurements. 

Firstly, let us reduce all the given marks to deviations about the
general mean. This is

$$\frac{100 + 85}{10} = 18.5$$

Then deviation of arts students’ mean from general mean = +1.5
and deviation of science students’ mean from general mean = -1.5

Now split the marks for each student into three components:
(1) the general average; (2) the deviation of his group mean;
(3) his individual deviation above or below the sum of the two
means.

Thus Cowper’s mark is 21 = 18.5 + 1.5 + 1.0.

MARKS ANALYSED IN DEVIATIONS OF MEANS AND INDIVIDUALS

   Deviations of        Deviations of        Total Deviation from
      Means              Individuals            General Mean
    1a       1b          2a       2b           3a       3b
   Arts   Science       Arts   Science        Arts   Science
   +1.5    -1.5         +1.0    +2.0          +2.5    +0.5
   +1.5    -1.5         -1.0    -3.0          +0.5    -4.5
   +1.5    -1.5         -2.0    +1.0          -0.5    -0.5
   +1.5    -1.5         +3.0    -2.0          +4.5    -3.5
   +1.5    -1.5         -1.0    +2.0          +0.5    +0.5

SQUARES OF THE ABOVE

    1a       1b          2a       2b           3a       3b
   2.25     2.25        1.00     4.00         6.25     0.25
   2.25     2.25        1.00     9.00         0.25    20.25
   2.25     2.25        4.00     1.00         0.25     0.25
   2.25     2.25        9.00     4.00        20.25    12.25
   2.25     2.25        1.00     4.00         0.25     0.25
  11.25    11.25       16.00    22.00        27.25    33.25
Total  22.50               38.00                 60.50

CALCULATION OF MEAN SQUARES

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square
Between Groups          2 - 1 = 1              22.50           22.50
Within Groups          10 - 2 = 8              38.00            4.75
Total                  10 - 1 = 9              60.50           (6.72)

VARIANCE-RATIOS, OBSERVED AND EXPECTED

Observed                    Degrees of Freedom   Expected
F = 22.50/4.75 = 4.737          1 and 8            5.32

The deviation of the mean and the deviation of the individual
are given in columns 1a, 1b and 2a, 2b respectively. It will be
seen that these add up to the deviation about the general mean
given in columns 3a and 3b. Further, in the table of squares it will
be seen that the totals of the squares of mean and of individual
deviations add up to the total of the squares of the deviation from
the general mean.

To obtain the ‘mean-squares’ or ‘variances’ we divide each of
the three square sums by the corresponding degrees of freedom.
There are 2 deviations for the 2 means, but as these are calculated
from the general mean of the data one degree of freedom has been
lost. There are 5 deviations about the mean for arts students and
5 about the mean for science students, and each set of these is
calculated from the mean of its group. Hence the number of
degrees of freedom is (5 - 1 + 5 - 1) = (10 - 2) = 8. As there
are 10 individual deviations about the general mean these give
(10 - 1) = 9 degrees of freedom.

In the table showing the variance or mean square note that the
column of degrees of freedom adds up to the degrees of freedom
of the whole group, and the square sums for the two components
add up to the square sum of the entire group; this provides
a useful check.

As we analyse the total sum of squares and not the total
variance, the variances do not add up to the total variance. We
now proceed to test the variance between the means of the two
groups. (If the variance to be tested is due solely to error, then
it should be equal to the error-variance. Hence to test the former
we divide by the latter.) The variance of the individuals within
the group, taken from the mean of either group, is treated as
denoting the error variance. The probabilities corresponding to
various values of the variance ratio F can be found in Fisher’s or
Snedecor’s tables, and as before 5% probability may be taken
as marking the borderline for significance. The observed ratio is 4.737
in this case, which is less than the borderline value. Again, by
this method we conclude that the difference between the two
means cannot be regarded as fully significant.

In the case under consideration $F = t^2$ (and we note that
$\sqrt{F} = \sqrt{4.737} = 2.176$, which was the value previously obtained
for t).
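The splitting of each mark into its three components, and the resulting partition of the sum of squares, can be reproduced as follows. This is an illustrative sketch using the ten marks of the problem; the variable names are not from the text.

```python
from math import sqrt

groups = {
    "arts": [21, 19, 18, 23, 19],
    "science": [19, 14, 18, 15, 19],
}
all_marks = [m for marks in groups.values() for m in marks]
general_mean = sum(all_marks) / len(all_marks)

# For every mark: (deviation of group mean, individual deviation, total deviation).
rows = []
for marks in groups.values():
    group_mean = sum(marks) / len(marks)
    for m in marks:
        rows.append((group_mean - general_mean, m - group_mean, m - general_mean))

ss_between = sum(r[0] ** 2 for r in rows)
ss_within = sum(r[1] ** 2 for r in rows)
ss_total = sum(r[2] ** 2 for r in rows)

df_between = len(groups) - 1
df_within = len(all_marks) - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(ss_between, ss_within, ss_total, round(f_ratio, 3), round(sqrt(f_ratio), 3))
```

The two component sums of squares add to the total, and the square root of the variance ratio reproduces the critical ratio t, as it must when there are only two groups.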

Testing the Significance of the Differences between Several Means*

Where the criterion of classification gives two classes only it is
adequate to test the difference between the two means by the
standard error of the difference, that is, by the t-ratio. When we
have three or more classes it is necessary to use methods involving
the variance or F-ratio. Suppose that instead of considering the
abilities of students in only two faculties of a university, we have
to make a comparison of students in all the faculties. Suppose,
for simplicity, we consider three faculties only and that the test
results are as follows:


MARKS FOR ARTS, SCIENCE AND MEDICAL STUDENTS

                                Arts    Science          Medicine
                                Mark     Mark      Mark    Dev.   Square
                                 21       19        18      +2       4
                                 19       14        16       0       0
                                 18       18        15      -1       1
                                 23       15        17      +1       1
                                 19       19        14      -2       4
Total                           100       85        80       0      10
Average                          20       17        16
Deviation from general mean    +2.33    -0.67     -1.67
Square                          5.44     0.44      2.78


It is unnecessary to repeat the deviations and squares for arts and
science students. It is also unnecessary to repeat the means, etc.,
for every person tested. We have simply to multiply the square
of the deviation of each mean by 5 (the number of individuals) and then take the
sum; or more simply to sum the squares first (5.44 + 0.44 + 2.78
= 8.67, more exactly 8.667) and then multiply the sum by 5. We obtain 5 × 8.667
= 43.3.

* I am indebted to Sir Cyril Burt for the treatment of this problem and for the
subsequent account, taken from his laboratory notes, of his adaptation of Fisher’s
methods.

The sums of the squares of the individual deviations within each
of the three groups (calculating from the corresponding group
mean) are 16 + 22 + 10 = 48. The square-sums for the 15
deviations from the general mean (17.67) need not be calculated,
except as a check.

Tabulating the results as before, we obtain the mean squares 
as follows: 


CALCULATION OF MEAN SQUARES

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square
Between Groups          3 - 1 = 2               43.3           21.67
Within Groups          15 - 3 = 12              48.0            4.00
Total                  15 - 1 = 14              91.3

VARIANCE RATIOS, OBSERVED AND EXPECTED

Observed                    Degrees of Freedom   Expected
F = 21.67/4.00 = 5.42           2 and 12           3.88

The ratio of the two variances is now 5.42, well above the value
we should expect with 2 and 12 degrees of freedom. Thus there
can now be little doubt that the difference of faculty does after
all tend to produce slight but genuine differences in the average
marks obtained by the test.
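The calculation for any number of groups may be gathered into a single routine. The sketch below is illustrative; the function `one_way_anova` is a name introduced here, and it does not assume that the groups are of equal size. The expected value of F must still be taken from the published tables.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F) for a list of groups."""
    n_total = sum(len(g) for g in groups)
    general_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Each group mean's squared deviation is weighted by the size of its group.
    ss_between = sum(len(g) * (m - general_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

arts = [21, 19, 18, 23, 19]
science = [19, 14, 18, 15, 19]
medicine = [18, 16, 15, 17, 14]

ss_b, ss_w, df_b, df_w, f = one_way_anova([arts, science, medicine])
print(round(ss_b, 1), ss_w, df_b, df_w, round(f, 2))
```

Weighting each squared deviation by the group size is what allows the same routine to serve when the samples are unequal.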

For purposes of illustration we have taken tiny samples with 
5 individuals in each. But the numbers in each sample need not 
be the same, and indeed may be so large that the sums of squares 
are best calculated from grouped frequencies. With continuous 
variates it is then better not to use Sheppard’s correction but to 
keep the grouping fine. 

The method may be conveniently used to test the significance
of the correlation ratio. Treating the groups as ‘arrays’ in a
correlation-table, we have

$$\eta^2 = \frac{\text{Sum of Squares between Groups}}{\text{Total Sum of Squares}} = \frac{43.3}{91.3} = .474$$

Hence $\eta$ = .689 (by consulting Yule and Kendall table, p. 454).
Two Criteria of Classification

Testing the Significance of a Difference between two Means

Problem: Consider the marks allotted to the four pupils as follows:

              Tom   Dick   Harry   George   Total   Average
Arithmetic     29     24      14        1      68        17
English        29     28      15        4      76        19
Drawing        32     27      27       22     108        27
Handwork       34     29      28       25     116        29
Total         124    108      84       52     368        92
Average        31     27      21       13      92        23

Take, to begin with, two pupils only. The average mark allotted
to Tom is 31, to Dick 27. Can we safely infer from this that Tom’s
general ability is significantly greater than Dick’s, or (since we
have used only 4 tests) is it more likely that the difference results
solely from chance?

1st Method: Standard Error of the Mean Difference

As before, the most obvious procedure is to calculate the
standard error of the difference by the usual formula.


CALCULATION OF STANDARD ERROR OF DIFFERENCE

                 1      2         3            4          5
Test            Tom   Dick   Difference   Deviation   Square
Arithmetic       29     24       +5           +1         1
English          29     28       +1           -3         9
Drawing          32     27       +5           +1         1
Handwork         34     29       +5           +1         1
Total           124    108      +16            0        12
Average          31     27       +4            0



Since Tom’s and Dick’s marks may be correlated, it is simpler to
calculate the detailed differences instead of the S.D.s of the marks
observed and their correlation. The calculation is shown in the
first 3 columns of the last table.

The deviations of the differences about the mean difference
(+4) are given in column 4. As usual, to find their standard
deviation we add the squares of the deviations (column 5), but
we divide by the number of degrees of freedom. When we started
there were n = 4 items, and therefore 4 ‘degrees of freedom’ (i.e.
4 figures that vary independently). But in taking deviations about
a mean calculated from the observed data, we have lost one degree
of freedom; for, when we know the first 3 deviations (or any 3
deviations), we can fill in the 4th from the fact that the total
must be 0.

Hence to find the ‘mean square’ we divide, not by 4 but by 3.
This mean square (12 ÷ 3 = 4) is the ‘variance’ of the individual
differences: and its square root (2) would be their standard
deviation.

But we require the standard deviation of the mean difference.
To obtain the variance of a mean, we divide the variance of the
individuals by the number of individuals. We then obtain
4 ÷ 4 = 1. The square root of this gives the standard deviation.
In the absence of any other information we must take the standard
deviation of the mean difference thus calculated as the best
indication of the standard error of the mean difference. Accordingly,
to test the significance of the mean difference (m) we
divide it by its standard deviation. Using the t-ratio as before,
we obtain

$$t = \frac{m}{\sqrt{v_m}} = \frac{4}{\sqrt{1}} = 4$$

(where $v_m = \sum d^2 / \{n(n - 1)\}$, d denoting the deviations in column 4).

From the t-table given by Yule and Kendall (p. 536) we find
that, with 3 degrees of freedom, a value of t = 4 gives a probability of .986.
Thus, the chance of getting a difference so large as this (in either
direction) would be P = 2(1 - .986) = .028, or 35 to 1 against.
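The first method may be retraced with a few lines of arithmetic. The sketch below is illustrative and uses the marks of Tom and Dick from the problem; the variable names are introduced here.

```python
from math import sqrt

tom = [29, 29, 32, 34]
dick = [24, 28, 27, 29]

# Work with the detailed differences, test by test.
diffs = [a - b for a, b in zip(tom, dick)]
n = len(diffs)
mean_diff = sum(diffs) / n

# Variance of the individual differences: divide by n - 1 degrees of freedom.
variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)

# Variance of the mean difference: divide again by the number of differences.
variance_of_mean = variance / n

t = mean_diff / sqrt(variance_of_mean)
print(mean_diff, variance, variance_of_mean, t)
```

Working on the differences makes it unnecessary to compute the correlation between the two boys' marks separately.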

The method indicated above has certain limitations although
it suffices for the actual problem which is given. We may desire
to test the significance of differences not only between two pupils
but between all the pupils in the class, but it would involve a great
deal of work to compare every pair of pupils by the method given.
Even if we did this the general picture would still not be clear, as
it is impossible to draw the general inference from the pairs
considered severally. We need a more comprehensive analysis of
all the data which has been given. This is given by a general
method of analysis of variance on the following lines:

The 8 observed marks are formed by
the deviations of 8 performances about the average performance
of both boys in all four tests (i.e. about the average mark of 29, set out in columns 1 and 2).
The 8 deviations are given in columns 3 and 4. Instead of measuring
the total amount of deviation by the sum of the 8 deviations
(which would be zero unless we ignore the signs) we can measure
it by the sum of the squares of those deviations. The squares are
given in columns 7 and 8.


CALCULATION OF TOTAL VARIANCE FOR TWO BOYS

                  Mean         Deviations     Squares of Means   Squares of Deviations
                Tom   Dick     Tom   Dick       Tom     Dick        Tom     Dick
Test             1      2       3      4         5        6          7        8
Arithmetic      29     29       0     -5        841      841         0       25
English         29     29       0     -1        841      841         0        1
Drawing         29     29      +3     -2        841      841         9        4
Handwork        29     29      +5      0        841      841        25        0
Total                              0           3364     3364        34       30
                                                   6728                 64
Grand total                                                6792

MEANS AND DEVIATIONS

              General Mean   Means of Boys   Means of Tests    Deviations       Totals
              Tom    Dick     Tom    Dick     Tom     Dick     Tom    Dick    Tom   Dick
Test           1a     1b       2a     2b       3a      3b       4a     4b      5a    5b
Arithmetic     29     29       +2     -2      -2.5    -2.5     +0.5   -0.5     29    24
English        29     29       +2     -2      -0.5    -0.5     -1.5   +1.5     29    28
Drawing        29     29       +2     -2      +0.5    +0.5     +0.5   -0.5     32    27
Handwork       29     29       +2     -2      +2.5    +2.5     +0.5   -0.5     34    29

SQUARES OF ABOVE

Test           1a     1b       2a     2b       3a      3b       4a     4b      5a    5b
Arithmetic    841    841        4      4      6.25    6.25     0.25   0.25    841   576
English       841    841        4      4      0.25    0.25     2.25   2.25    841   784
Drawing       841    841        4      4      0.25    0.25     0.25   0.25   1024   729
Handwork      841    841        4      4      6.25    6.25     0.25   0.25   1156   841
Total        3364   3364       16     16     13.00   13.00     3.00   3.00   3862  2930
Combined        6728              32              26               6            6792

Components

Our task is now to analyse these gross deviations into their chief
components. Each deviation may be regarded as the sum of
3 deviations: (i) the mean deviation of the particular boy above
or below the general mean (29); (ii) the mean deviation of the
particular test above or below the general mean; (iii) the individual
deviation of each mark from the sum of these two means.
This subdivision is shown in the table of Means and Deviations.
Observe that, in combination with the general mean, the three
figures add up to the original marks, appended in the last two
columns.

We now square all these figures and enter them in the Table of
Squares where they are analysed. We notice that the component
sums at the bottom of the table add to the grand total (6792).

We are not concerned with the squares of the general mean
(6728). What interests us is the partition of the sum of the squares
of the unanalysed deviations (64) into the sum of the sums of
the squares of the three components. We observe that

64 = 32 + 26 + 6

The Variances 

We can now proceed to test the significance, not only of the 
variance due to the differing means of the 2 boys, but also of the 
variance due to the differing means of the 4 subjects. As before, 
what we shall test is not the differences between the means, but 
the total variance of the means. The sums of squares and the 
degrees of freedom by which we divide them are tabulated in the 
first two columns of the table. The result of the division is given 
in the last column. 

ANALYSIS OF VARIANCE: (TWO BOYS)

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square
Boys                       32             2 - 1 = 1            32
Tests                      26             4 - 1 = 3             8.67
Error                       6             8 - 5 = 3             2
Total                      64             8 - 1 = 7            (9.14)

Degrees of Freedom

Since the deviations of the 2 boys’ means and the deviations of
the 4 test means are calculated about the general mean, we must
deduct one degree of freedom from each. The same is true of the
deviations of the 8 performances: but this we only need as a check.
The boys’ variance and the test variance are the variances to be
tested, and so form the numerator of the variance-ratio. And
since a variance, not a difference, is being tested, we require for
the denominator, not the standard error, but the error variance.
The only part of the data that we can use to indicate the error
variance will be the deviations of the 8 performances from the
sum of the means, i.e. the deviations shown in columns 4a and 4b.
There are 8 figures; but in calculating these figures from the
original 8 marks we have already used 5 degrees of freedom (1 for
the general mean; 2 - 1 = 1 for the boys’ means; and 4 - 1 = 3
for the test means). Hence only 3 degrees of freedom are left. It
is easy to see that, if we take any 3 figures in columns 4a and 4b,
say +0.5, -1.5, +0.5, we can deduce the other 5, because we
know that the sums of both columns and rows must all be zero.

Significance Test 

To test significance, we now take the ratios of the variance of 
the boys’ means, and then of the test means, to the variance due 
to ‘error’. 

VARIANCE RATIOS (F), OBSERVED AND EXPECTED

Source   Ratio   Observed           Degrees of Freedom   Expected
Boys     F_B     32/2 = 16              1 and 3            10.1
Tests    F_T     8.67/2 = 4.33          3 and 3             9.3

Thus the difference between the two boys is fully significant, but
the differences between the tests (applied to only two pupils in
this part of the inquiry) are not significant.

Relation between the Two Alternative Methods

Since, with 1 and 3 degrees of freedom, an F-ratio of 10.1
represents P = 0.05, we might guess, by rough interpolation, that
an F-ratio of 16 would represent P = 0.03 or thereabouts (the
value obtained with the first method). In fact, we note as before
that $F = t^2$, for F = 16 and t = 4.
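The second method, with two criteria of classification, can be set out as a small routine. The following is a minimal sketch, assuming one mark in every cell of the table; the function `two_way_anova` and the layout of its result are introduced here for illustration.

```python
def two_way_anova(table):
    """table[i][j] is the mark in test i (row) for boy j (column), one mark per cell."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    # Residual: what is left of each mark after removing both means.
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows) for j in range(n_cols)
    )
    df_cols, df_rows = n_cols - 1, n_rows - 1
    df_error = df_cols * df_rows
    ms_error = ss_error / df_error
    return {
        "ss": (ss_cols, ss_rows, ss_error),
        "df": (df_cols, df_rows, df_error),
        "f_cols": (ss_cols / df_cols) / ms_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
    }

# Rows: Arithmetic, English, Drawing, Handwork. Columns: Tom, Dick.
two_boys = [[29, 24], [29, 28], [32, 27], [34, 29]]
result = two_way_anova(two_boys)
print(result)
```

The columns here stand for the boys and the rows for the tests; the same routine serves unchanged if more boys or more tests are added.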


Testing Reliability 

There is no reason why the two columns of observed figures,
like those set out in columns 1 and 2 above, should always
represent persons, or the rows should always represent tests. For
example, if we had applied two tests to four (or more) persons,
then the headings ‘Tom’ and ‘Dick’ would be altered to ‘1st Test’,
‘2nd Test’; and the side-titles would be the names of the persons
tested instead of names of school subjects. This is the form the
data take when we wish to test the reliability of two successive
applications of the same test. The two means of the columns will
now represent difficulty of tests, or possibly the improvement
shown in the second test as a result of practice or familiarity with
the first; and the means of the pair of marks in each row the
average ability of the boys tested. Unless the averages for the
boys differ significantly, the test is failing to differentiate between
the several persons tested, and so is devoid of reliability. The usual
measure of the amount of reliability is, of course, the correlation
between the two columns.


Testing the Significance of the Differences between several Means

Problem

The advantages of the second procedure are most evident where
we desire to test the significance of the differences between the
means, not for two boys only, but for several, say four. As
before we can at the same time test the significance of the differences
between the means for the four school subjects. Subtracting
the general mean (23) from the figures in the table for four boys
we have



ANALYSIS OF VARIANCE 


137 


                        DEVIATIONS                          SQUARES OF DEVIATIONS
Test        Tom  Dick  Harry  George  Total  Mean     Tom  Dick  Harry  George  Total
Arithmetic   +6   +1    -9     -22     -24    -6       36     1    81     484     602
English      +6   +5    -8     -19     -16    -4       36    25    64     361     486
Drawing      +9   +4    +4      -1     +16    +4       81    16    16       1     114
Handwork    +11   +6    +5      +2     +24    +6      121    36    25       4     186
Total       +32  +16    -8     -40       0     0      274    78   186     850    1388
Mean         +8   +4    -2     -10       0     0

Components

We now analyse these deviations into the same three components
as before, namely (i) the mean deviation of each boy;
(ii) the mean deviation of each test; (iii) the deviation of each of
the 16 performances from the sum of the two means. These are
shown in the first table below. The reader should check the fact
that for each performance the three components add up to the
deviation shown above.

The squares of these deviations follow:

ANALYSIS OF DEVIATIONS                            SQUARES

              Tom  Dick  Harry  George            Tom  Dick  Harry  George  Total

(1) Deviations for Boys                           (1) Squares of Deviations
Arithmetic     +8   +4    -2     -10               64    16     4     100     184
English        +8   +4    -2     -10               64    16     4     100     184
Drawing        +8   +4    -2     -10               64    16     4     100     184
Handwork       +8   +4    -2     -10               64    16     4     100     184
Square Sum                                        256    64    16     400     736

(2) Deviations for Tests                          (2) Squares of Deviations
Arithmetic     -6   -6    -6      -6               36    36    36      36     144
English        -4   -4    -4      -4               16    16    16      16      64
Drawing        +4   +4    +4      +4               16    16    16      16      64
Handwork       +6   +6    +6      +6               36    36    36      36     144
Square Sum                                        104   104   104     104     416

(3) Deviations for Performances                   (3) Squares of Deviations
Arithmetic     +4   +3    -1      -6               16     9     1      36      62
English        +2   +5    -2      -5                4    25     4      25      58
Drawing        -3   -4    +2      +5                9    16     4      25      54
Handwork       -3   -4    +1      +6                9    16     1      36      62
Square Sum                                         38    66    10     122     236

Error

Provisionally we shall treat the four tests as random (and
therefore uncorrelated) specimens of tests for ‘general ability’:
that would mean that we can take the last set of deviations (the
residuals) as due to ‘error’. Strictly this assumption should be
tested first of all: and in fact we shall presently see that it is not
tenable. But for the present we are concerned only to illustrate
the procedure for simple cases first.

Degrees of Freedom

The degrees of freedom are calculated as before. The easiest
way to decide the degrees of freedom for the ‘error variance’ is to
subtract from the total degrees (15) the degrees for the other two
items (3 + 3 = 6); that is equivalent to subtracting from the
total number (16) the number of constants used to calculate the
deviations for error (1 + 3 + 3 = 7).

We can now tabulate the calculations for the mean squares (or
‘variances’) in the same way as before.

ANALYSIS OF VARIANCE: (FOUR BOYS)

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square
Boys                      736             4 - 1 = 3            245.3
Tests                     416             4 - 1 = 3            138.7
Residual                  236            16 - 7 = 9             26.2
Total                    1388            16 - 1 = 15           (92.5)

Significance Test

The variance ratios are calculated as before.

VARIANCE RATIOS (F), OBSERVED AND EXPECTED

Source   Ratio   Observed               Degrees of Freedom   Expected
Boys     F_B     245.3/26.2 = 9.36          3 and 9             8.8
Tests    F_T     138.7/26.2 = 5.29          3 and 9             8.8

The degrees of freedom are now larger than before because we
have taken 4 boys instead of only 2. And once again the differences
between the 4 boys appear to be fully significant, but (with
error assessed as above) the differences between the 4 tests are not
significant.
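For larger tables the sums of squares are often obtained more quickly from the row and column totals than from tables of deviations. The sketch below uses that totals shortcut, which is a standard computational device and not the procedure set out in the text; it is offered only as a check on the figures for the four boys, and the variable names are introduced here.

```python
marks = {
    "Arithmetic": [29, 24, 14, 1],
    "English":    [29, 28, 15, 4],
    "Drawing":    [32, 27, 27, 22],
    "Handwork":   [34, 29, 28, 25],
}
rows = list(marks.values())          # one row per test, one column per boy
n_tests, n_boys = len(rows), len(rows[0])

grand_total = sum(map(sum, rows))
# Correction term: square of the grand total divided by the number of marks.
correction = grand_total ** 2 / (n_tests * n_boys)

boy_totals = [sum(col) for col in zip(*rows)]
test_totals = [sum(row) for row in rows]

ss_total = sum(x * x for row in rows for x in row) - correction
ss_boys = sum(t * t for t in boy_totals) / n_tests - correction
ss_tests = sum(t * t for t in test_totals) / n_boys - correction
ss_residual = ss_total - ss_boys - ss_tests

df_boys, df_tests = n_boys - 1, n_tests - 1
df_residual = df_boys * df_tests
ms_residual = ss_residual / df_residual
f_boys = (ss_boys / df_boys) / ms_residual
f_tests = (ss_tests / df_tests) / ms_residual
print(ss_boys, ss_tests, ss_residual, round(f_boys, 2), round(f_tests, 2))
```

The residual is found by subtraction, so any slip in the other sums of squares will show itself when the components are compared with the table of deviations.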

Testing Reliability 

Suppose that Tom, Dick, Harry and George are the names of
four examiners marking test performances by four boys in the
same subject. Thus the names of the rows down the left-hand
margin of the table are names of candidates taking the tests. We
can now use the analysis of variance to measure the reliability or
self-consistency of the whole examination. We could vary this by
making the headings of the columns four component tests instead
of four different examiners. The reliability coefficient is given by


P- E 



where P is the mean square for pupils or candidates and E is the
mean square for error based on the residuals.*

Testing the Significance of Group Factors (Interaction) 

Problem 

The foregoing are the simplest and commonest types of case in
which the analysis of variance can be applied. We now proceed
to introduce a further complication.

In estimating the variance for error, we assumed that the
deviations of the 16 performances from the combined means of
boy and test were random deviations. A glance at the figures
headed ‘deviations for performances’ on page 137 is sufficient to
show that they are not random, but correlated. We must therefore
treat them as containing yet another component, a bipolar
component. This is technically termed interaction, because the
type of boy tested ‘interacts’ with the type of test used, i.e. an
academic type of boy does well in the academic type of test,
whether Arithmetic or English and, by comparison, badly in the
practical type of test; conversely for the practical type of boy.

* This is developed by Burt in The British Journal of Educational Psychology, XV,
pages 80-93. The use of factor analysis for a similar purpose is given in Burt, Marks
of Examiners.

This bipolar component we can assess by averaging the deviations
in each column, reversing the signs of the last two to prevent
the totals adding up to zero. We then calculate the deviations
about these further averages. Thus the variance of the deviations
for performances can itself be analysed along the same lines as
before.


(4) DEVIATIONS FOR BIPOLAR COMPONENT AND SQUARES OF DEVIATIONS

                    Deviations                 Squares of Deviations        Total
Arithmetic     +3    +4   -1.5   -5.5        9     16    2.25   30.25       57.5
English        +3    +4   -1.5   -5.5        9     16    2.25   30.25       57.5
Drawing        -3    -4   +1.5   +5.5        9     16    2.25   30.25       57.5
Handwork       -3    -4   +1.5   +5.5        9     16    2.25   30.25       57.5
Square Sum                                  36     64    9     121         230.00


(5) DEVIATIONS FOR ERROR AND SQUARES OF DEVIATIONS

                    Deviations                 Squares of Deviations        Total
Arithmetic     +1    -1   +0.5   -0.5        1      1    0.25    0.25        2.5
English        -1    +1   -0.5   +0.5        1      1    0.25    0.25        2.5
Drawing         0     0   +0.5   -0.5        0      0    0.25    0.25        0.5
Handwork        0     0   -0.5   +0.5        0      0    0.25    0.25        0.5
Square Sum                                   2      2    1       1           6


The degrees of freedom for the ‘bipolar component’ will evidently
be 3; and those for the ‘deviations for error’ will evidently
be 6. We have thus split what we previously assumed to be ‘error’
into two components. Note that both the square-sum and the
degrees of freedom now obtained add up to those previously
assigned to ‘error’ in the table of the analysis of variance for four
boys.
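The splitting of the old ‘error’ into a bipolar component and a new error can be sketched as follows. The ‘deviations for performances’ used here are an assumption: the table on page 137 is not reproduced in this section, so the figures are taken as the sum of tables (4) and (5). The variable names are ours.

```python
# Deviations for performances (rows: tests, columns: the four boys).
# Assumed figures, obtained by adding tables (4) and (5) entry by entry.
deviations = {
    "Arithmetic": [4.0, 3.0, -1.0, -6.0],
    "English":    [2.0, 5.0, -2.0, -5.0],
    "Drawing":    [-3.0, -4.0, 2.0, 5.0],
    "Handwork":   [-3.0, -4.0, 1.0, 6.0],
}
# +1 for the academic tests, -1 for the practical tests (signs reversed).
signs = {"Arithmetic": 1, "English": 1, "Drawing": -1, "Handwork": -1}

n_boys = 4
# Average each column after reversing the signs of the last two rows.
bipolar_means = [
    sum(signs[test] * row[j] for test, row in deviations.items()) / len(deviations)
    for j in range(n_boys)
]
# The bipolar component restores the sign belonging to each test.
bipolar = {test: [signs[test] * m for m in bipolar_means] for test in deviations}
# What remains after removing the bipolar component is the new error.
error = {
    test: [d - b for d, b in zip(deviations[test], bipolar[test])]
    for test in deviations
}

ss_bipolar = sum(v * v for row in bipolar.values() for v in row)
ss_error = sum(v * v for row in error.values() for v in row)
```

The two square-sums together (230 and 6) reproduce the 236 previously assigned to error, which is the check mentioned in the text.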

We must now analyse the total variance afresh.

(In setting out tables like the following the beginner finds it best
to set the obtained figure first, the degrees of freedom next, and
the calculated or textbook figures last, since that is the order of
working. The experienced worker, however, will put the degrees
of freedom first, since they really indicate the structure and
fundamental conditions of the analysis.)

ANALYSIS OF VARIANCE (WITH FOUR COMPONENTS)

Source of        Sum of      Degrees of       Mean
Variation        Squares     Freedom          Square
Boys               736       4 - 1 = 3        245.3
Tests              416       4 - 1 = 3        138.6
Interaction        230       4 - 1 = 3         76.6
Error                6       16 - 10 = 6        1.0
Total             1388       16 - 1 = 15      (92.5)

The observed and expected variance ratios may be tabulated as
follows. The divisor is now 1.0 in every case.

VARIANCE RATIOS

                         Ratio       Degrees of       Expected
Source                   Observed    Freedom          5%       1%
Boys             F       245.3       3 and 6          4.76     9.78
Tests            F       138.6       3 and 6          4.76     9.78
Interaction      F        76.6       3 and 6          4.76     9.78


Thus, when we allow for the fact that the tests are highly
correlated, and thus confirm one another far more strongly than
a random set of tests, the differences between boys, between tests,
and between types of boy (or test) appear highly significant.
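The arithmetic behind the two tables can be sketched as follows: each mean square is a square-sum divided by its degrees of freedom, and each variance ratio is a mean square divided by the mean square for error. The dictionary names are ours; the critical values 4.76 and 9.78 are those quoted in the table for 3 and 6 degrees of freedom.

```python
square_sums = {"Boys": 736, "Tests": 416, "Interaction": 230, "Error": 6}
degrees = {"Boys": 3, "Tests": 3, "Interaction": 3, "Error": 6}

# Mean square = square-sum / degrees of freedom.
mean_squares = {k: square_sums[k] / degrees[k] for k in square_sums}

# Variance ratio = mean square / mean square for error (here 1.0).
ratios = {
    k: mean_squares[k] / mean_squares["Error"]
    for k in ("Boys", "Tests", "Interaction")
}

CRITICAL_5_PER_CENT = 4.76   # tabled value for 3 and 6 degrees of freedom
CRITICAL_1_PER_CENT = 9.78

significant_at_1_per_cent = {k: r > CRITICAL_1_PER_CENT for k, r in ratios.items()}
```

Carried to more places the ratios are 245.33, 138.67 and 76.67; the table appears to cut these off at one decimal rather than round them. All three far exceed the 1% value.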

Application to Factor Analysis 

It will now be seen that we have demonstrated the statistical 
significance of (i) the ‘general factor’ of average ability, and (ii) 
the ‘group factor’ of academic versus practical ability. Thus, 
provided the factor-measurements are obtained by simple 
averaging, we have found a convenient method for testing the 
significance of factors. 

(The high significance thus obtained with a sample consisting
of 4 boys only may seem surprising. But the correlations are
equally high. Thus, the observed correlation for Arithmetic and
English is .99 and the residual correlation .92. Now with 4 items
the 1% level is .99 and the 5% level .90. But we have not one
correlation but 6 in each case, though not all 6 residual correlations
will be independent. Thus the rough test of significance
applied to the correlations confirms the more precise test obtained
by analysing variance. However, it should be remembered that
the figures given in this example are purely artificial, chosen to
simplify the mental arithmetic, rather than to illustrate the kind
of figures actually obtained.)

Interaction 

When planning a research which will involve the analysis of
variance the ‘factors’ are chosen not so much because they
operate independently but because they can be controlled and
measured. Thus it is necessary to devise methods of research
wherein the joint effects of the varying factors may be compared
with their isolated effects, and it is possible that the joint effect
will not be the mere sum of the respective effects. We can adapt
the methods given by Fisher in his Design of Experiments where the
investigations concerned agriculture (manuring of fields, rotation
of crops, etc.) to our educational problems. Much investigation
remains to be done on suitable teaching methods for children of
various ages and capacities and in various subjects. We might
use (a) oral methods alone, (b) film strip, (c) cinema film, (d) practical
work and exercises, and (e) a combination of two or more of
these methods. We might expect that combinations of the
methods might be more effective than the use of a single method.

In the analysis of variance what is known as error is the combined
effect of various influences which either cannot be or are
not controlled in the investigation. Certain precautions must be
observed in order that we can estimate this error. With small
sampling techniques it is necessary to secure the replication or
repetition of individual items with similar factorial content.
Where the ‘interactions’ are known or can be shown to be
insignificant they may be used to measure error. Secondly, within the
conditions imposed by the experimental design the items should
be assigned at random. Randomisation may be secured by a
mechanical method such as tossing coins, drawing cards or by
using sets of random numbers. Fisher used the name ‘randomised
blocks’ for an experimental design which involved these principles.
Eight blocks of land are selected and each is divided into five 
plots. Five varieties of a particular kind of crop, or five types of 
fertiliser are assigned at random to each plot. We could translate 
this into a research in education by testing the relative merits of 
five different methods of training. Such problems as the methods 
of teaching various processes in arithmetic, improving memoriza- 
tion or treating delinquents would be susceptible to such treat- 
ment. Obviously the children to be studied will differ according 
to home and school environment and thus the children used in 
the investigation are chosen from eight schools. Children of 
about the same age are picked at random from the schools and a 
different method of training is allotted at random to each indivi- 
dual. In analysing the results there will be only one criterion of 
classification — that according to training or treatment. 
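A randomised-block allocation of this kind can be sketched as follows, with a random number generator standing in for the coins, cards or random-number tables. The function name, the school labels and the method letters are hypothetical; the sketch assumes five children are drawn from each of the eight schools, one for each method.

```python
import random


def randomised_blocks(blocks, treatments, seed=None):
    """Within each block, allot every treatment once, in random order.

    Position k of the returned list for a block is the treatment
    given to the k-th child picked from that block.
    """
    rng = random.Random(seed)   # a seed makes the allocation reproducible
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)      # a fresh random order for every block
        plan[block] = order
    return plan


schools = [f"School {i}" for i in range(1, 9)]   # the eight blocks
methods = ["A", "B", "C", "D", "E"]              # five methods of training
plan = randomised_blocks(schools, methods, seed=1)
```

Shuffling separately within each school keeps every method represented once per school, so that differences between schools do not favour any one method.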

But if the number of performances is large enough the number
of ways in which they are classified or cross-classified may be
increased from two to three or more.

Example: We wish to investigate the efficacy of four different
training methods (e.g. the remedial teaching of backward spellers).
Four boys are selected and all four will be subjected to all the four
methods. To obviate possible differences arising from the test
words used in the experiment, all the words will have to be taught
by all the methods. It is possible, even probable, that the order
in which a boy is taught by the different methods may make some
difference to the result. For instance, if he starts with a phonic
method and goes on to a copying method, the latter might be
helped by the former. Again, if he starts the week with a phonic
method and goes on to the others on subsequent days this might
affect the results. Thus, as far as can possibly be managed it is
necessary so to arrange the order that, with one boy or another,
each method follows and precedes every one of the others.

The following arrangement, which meets these requirements,
is known as the Latin Square as the Roman or Latin capitals
A, B, C and D represent the four methods. When a further
classification is necessary Greek letters are used and the
arrangement is then known as a Graeco-Latin Square.

ARRANGEMENT OF TEACHING METHODS IN A LATIN SQUARE

Order      Tom      Dick     Harry    George
  1         A        B        C         D
  2         B        D        A         C
  3         C        A        D         B
  4         D        C        B         A
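The properties claimed for this arrangement can be checked mechanically. The sketch below (variable names ours) confirms that every boy receives every method, that every position in the order contains every method, and that each method is immediately followed by each of the others exactly once.

```python
from itertools import permutations

# Order in which each boy is taught by methods A to D (columns of the square).
latin_square = {
    "Tom":    ["A", "B", "C", "D"],
    "Dick":   ["B", "D", "A", "C"],
    "Harry":  ["C", "A", "D", "B"],
    "George": ["D", "C", "B", "A"],
}
methods = sorted(latin_square["Tom"])

# Every boy is taught by every method once.
each_boy_gets_all = all(sorted(seq) == methods for seq in latin_square.values())

# Every position in the order (row of the square) contains every method once.
each_order_has_all = all(
    sorted(position) == methods for position in zip(*latin_square.values())
)

# Pairs (earlier method, method taught immediately after it).
successions = {
    (seq[i], seq[i + 1])
    for seq in latin_square.values()
    for i in range(len(seq) - 1)
}
# All 12 ordered pairs of different methods should appear.
every_succession_occurs = successions == set(permutations(methods, 2))
```

Not every Latin square has the third property; this particular square has been chosen so that the order effects are balanced as well.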


We will now express the marks in the tests designed to examine
the teaching methods. For convenience in analysis these have
been arranged in the form of deviations from the general mean.


RESULTS OF TEACHING

Test Material    Tom    Dick    Harry    George    Total    Average    Square
I                 26     15      -3       -10        28        7         49
II                22      5      -1       -18         8        2          4
III              -10      1      -9         2       -16       -4         16
IV                -2     -9      -7        -2       -20       -5         25
Total             36     12     -20       -28         0        0         94
Average            9      3      -5        -7         0
Square            81      9      25        49       164




To calculate the averages for each training method we rearrange
the figures in each column as follows:

Teaching Method   Tom    Dick    Harry    George    Total    Average    Square
A                  26      1      -1        -2        24        6         36
B                  22     15      -7         2        32        8         64
C                 -10     -9      -3       -18       -40      -10        100
D                  -2      5      -9       -10       -16       -4         16
Total              36     12     -20       -28         0        0        216


From each figure in the last table but one we now subtract the 
sum of the appropriate averages for (i) the boy, (ii) the test 
material, and (iii) the teaching method. We obtain the following 
residuals: 


RESIDUALS AND THEIR SQUARES

                           Residuals                                 Squares
Test Material   Tom   Dick   Harry   George   Total     Tom   Dick   Harry   George   Total
I                 4    -3      5       -6       0        16     9     25       36       86
II                3     4     -4       -3       0         9    16     16        9       50
III              -5    -4      4        5       0        25    16     16       25       82
IV               -2     3     -5        4       0         4     9     25       16       54
Total             0     0      0        0       0        54    50     82       86      272

The sums of the squares are tabulated below. In entering those
for each of the means we have multiplied the squares from a
single column or row by the number of columns or rows (in this
case 4), since the means are repeated in each column and in
each row.


ANALYSIS OF VARIANCE (LATIN SQUARE)

Source of             Degrees of     Sum of      Mean
Variation             Freedom        Squares     Square
Boys                      3            656       218.6
Test Material             3            376       125.3
Teaching Methods          3            864       288.0
Residuals                 6            272        45.3
Total                    15           2168



VARIANCE RATIOS, OBSERVED AND EXPECTED

                                                  Degrees of      Expected
Source                Observed                    Freedom         5%       1%
Boys                  218.6 / 45.3 = 4.82         3 and 6         4.76     9.78
Test Material         125.3 / 45.3 = 2.76         3 and 6         4.76     9.78
Teaching Methods      288.0 / 45.3 = 6.35         3 and 6         4.76     9.78
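The residuals and the variance ratio for teaching methods can be reproduced from the table of deviations and the Latin Square arrangement. The sketch below uses our own variable names; each row of `methods` gives the letters allotted to Tom, Dick, Harry and George for that test material.

```python
# Deviations from the general mean; rows are test materials I to IV,
# columns are Tom, Dick, Harry and George.
deviations = [
    [26, 15, -3, -10],
    [22, 5, -1, -18],
    [-10, 1, -9, 2],
    [-2, -9, -7, -2],
]
# Teaching method applied in each cell (the Latin Square).
methods = ["ABCD", "BDAC", "CADB", "DCBA"]
n = 4

boy_avg = [sum(row[j] for row in deviations) / n for j in range(n)]
material_avg = [sum(row) / n for row in deviations]
method_avg = {
    m: sum(
        deviations[i][j]
        for i in range(n)
        for j in range(n)
        if methods[i][j] == m
    ) / n
    for m in "ABCD"
}

# Residual = figure minus the averages for boy, test material and method.
residuals = [
    [
        deviations[i][j] - boy_avg[j] - material_avg[i] - method_avg[methods[i][j]]
        for j in range(n)
    ]
    for i in range(n)
]
ss_residual = sum(r * r for row in residuals for r in row)

# Each mean is repeated 4 times, hence the factor n in the square-sum.
ss_methods = n * sum(a * a for a in method_avg.values())
ratio_methods = (ss_methods / 3) / (ss_residual / 6)
```

With these figures `ss_residual` is 272 and `ss_methods` is 864, giving the ratio 6.35 shown for teaching methods.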


The differences in the effects of teaching are fully significant
but those for the boys are only just over the borderline. There is
no discernible difference in the different types of teaching
material.

With a more elaborate experiment we could study the interactions,
that is, the differences in effect of teaching methods on
particular types of pupil or test material. It has been assumed in
the above example that the ‘interaction’ can be taken as a measure
of error for the main effects.

Methods of Working

In actual practice it will involve considerable labour to work
with the actual deviations, for the means will usually involve
decimal fractions. The following procedure will make the
arithmetical work simple and mechanical. It will be illustrated
from the problem on page 144 involving three criteria. Here are
the steps of the process:

1. Find the totals of the rows and the columns, and the grand
total.

2. Divide the totals by the number in the corresponding row,
column or table.

3. Multiply each total by the corresponding mean.

This may be done by a calculating machine, but if one is not
available, square the means, and multiply by the number of
items on which each mean is based. (The result is obviously the
same, but the ‘total x mean’ method avoids any mistakes in
multiplying the squares, when the number of rows differs from
the number of columns.)

4. Add the products.

5. With the Latin Square rearrange the rows and find the
‘totals x means’ as before.

6. Square each figure in the first table and find the grand total
of the squares.

7. From each of the four totals thus obtained, subtract the
product of the grand total by the grand mean. The results are
the square-sums for the various means and the total square-sum.

8. To find the square-sum for the residuals, subtract the sum
of the three square-sums for the means from the total square-sum.
The final result can be checked by directly calculating the squares
for the residuals, at least approximately.





WORKING METHOD. STEPS I, II, III AND IV

Test Material    Tom     Dick    Harry    George    Total    Mean    Product
I                A 46    B 35    C 17     D 10       108      27      2916
II               B 42    D 25    A 19     C  2        88      22      1936
III              C 10    A 21    D 11     B 22        64      16      1024
IV               D 18    C 11    B 13     A 18        60      15       900
Total             116      92      60       52       320      80      6776
Mean               29      23      15       13        80      20
Product          3364    2116     900      676      7056              6400


STEP V

Teaching Method  Tom     Dick    Harry    George    Total    Mean    Product
A                 46      21      19       18        104      26      2704
B                 42      35      13       22        112      28      3136
C                 10      11      17        2         40      10       400
D                 18      25      11       10         64      16      1024
Total                                                                  7264


STEP VI

Teaching Method  Tom     Dick    Harry    George    Total
A                2116     441     361      324      3242
B                1764    1225     169      484      3642
C                 100     121     289        4       514
D                 324     625     121      100      1170
Total            4304    2412     940      912      8568


STEP VII

                     Crude            Correction       Square
                     Square Sum       Term             Sum
Boys                   7056      -      6400      =      656
Test Material          6776      -      6400      =      376
Teaching Methods       7264      -      6400      =      864
Total                  8568      -      6400      =     2168


STEP VIII

Square Sum for Residuals = 2168 - (656 + 376 + 864) = 272
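The eight steps can be sketched as a short program working from the raw marks, with no deviations formed at any stage. The helper name `total_times_mean` and the other variable names are ours.

```python
# Raw marks; rows are test materials I to IV, columns Tom, Dick, Harry, George.
marks = [
    [46, 35, 17, 10],
    [42, 25, 19, 2],
    [10, 21, 11, 22],
    [18, 11, 13, 18],
]
# Teaching method used in each cell (the Latin Square).
methods = ["ABCD", "BDAC", "CADB", "DCBA"]


def total_times_mean(groups):
    """Steps 1 to 4: sum of (total x mean) over a set of groups."""
    return sum(sum(g) * (sum(g) / len(g)) for g in groups)


rows = marks
columns = [list(col) for col in zip(*marks)]
# Step 5: rearrange the figures according to teaching method.
by_method = [
    [marks[i][j] for i in range(4) for j in range(4) if methods[i][j] == m]
    for m in "ABCD"
]
all_marks = [v for row in marks for v in row]

# Correction term: grand total x grand mean (320 x 20).
correction = total_times_mean([all_marks])

# Steps 6 and 7: crude square-sums less the correction term.
ss_boys = total_times_mean(columns) - correction
ss_material = total_times_mean(rows) - correction
ss_methods = total_times_mean(by_method) - correction
ss_total = sum(v * v for v in all_marks) - correction

# Step 8: the residual square-sum is found by subtraction.
ss_residual = ss_total - (ss_boys + ss_material + ss_methods)
```

The results agree with the square-sums obtained earlier from the deviations, which is the check recommended in step 8.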


Such comparatively simple analysis may lead to more elaborate
experimental designs such as those in which there may be two or
three criteria of classification, one or two essential interactions and
several items instead of only one in each sub-class.¹ The technique
may also be extended to the testing of simple and multiple
regressions and their linearity. This is given in Mather, Chapters
VIII and IX. It may also be applied to intra-class correlation (see
Fisher, Statistical Methods) and to the analysis of covariance. The
latter is necessary where the criteria of classification may be not
independent but correlated. Suppose it is necessary to test
alleged differences in educational attainments between children
in various parts or towns of a county at a transfer examination.
It may be that the age composition may vary from one part to
another. Regression must then be used to eliminate the effects
of differing age. This is best done by analysing the covariance
as well as the variance. The method is given in Snedecor,
Chapter VIII.

¹ See Sir Cyril Burt’s report on ‘Teaching Backward Readers’, British Journal of
Educational Psychology, XVI.

The works of Fisher, Snedecor, Yule and Tippett mentioned
in the bibliography may be consulted for more advanced work on
the analysis of variance.

GRAPHS AND GRAPHICAL METHODS.
THE DIFFERENTIAL CALCULUS AND
TRIGONOMETRICAL FUNCTIONS

Graphical methods of expression will prove very helpful in
simple statistical investigations. In fact, for those who have
only the slightest knowledge of mathematics they will often
prove to be the only means of dealing with the results of an
investigation, lists of scores and so on. Even where the investigator
is well equipped mathematically the graphical method still remains
the best means of recording and interpreting results, in many
cases.

Graphs make an immediate appeal to the eye. Even where 
there is little ‘aptitude for figures’ the visual image is the one 
above all others, which can be most easily remembered, analysed 
and interpreted. 

Graphs give a picture of the variation of one quantity with 
another, and properly interpreted the graph will provide a clue 
to the extent and nature of this variation. 

Unless the investigator knows something of the calculus, of
exponentials, etc., the graph is often the only means of representing
the variation. Finding the areas enclosed by graphs is an easy
way of ‘integrating’; tangents drawn to points on curved graphs
anticipate the process of ‘differentiating’. Maximum and minimum
points are easily seen and interpreted. With a graph,
interpolation is possible, that is, intermediate values between the
plotted points may be found. A curve or line may be extended by
having regard to its general shape and hence finding further
values which are outside the range of the points that are plotted.
This is known as extrapolation. The processes of interpolation
and extrapolation are not to be undertaken lightly. In the former
case intermediate values should be found by experiment and
observation particularly where a curve turns sharply. In the
latter case the continuation of a line is a very risky procedure for
factors may come into play which alter the general trend and in
psychological investigations these ‘tails’ may have considerable
significance. Interpolation and extrapolation should be applied
on the merits of each case and then with care and reticence.

A point (x, y) may be fixed on a plane surface by referring it to
two axes. It is convenient to draw these as straight lines at
right angles. If the horizontal and vertical axes divide the graph
paper into four equal parts we can provide for an equal number of
x and negative x values and of y and negative y values. If we are
only concerned with positive values of x and y it will suffice to
draw the axes respectively at the bottom and at the left side of
the paper. Distances are measured from the origin, which is the
point O where the axes intersect, and it is conventional to regard
values measured to the right and upwards as positive and those
to the left and downwards as negative. To plot a point it is
necessary to measure along the x axis a distance x and upwards
a distance y. It is necessary to consider carefully what scales can
be employed for both x and y values, in other words, how many
units of x and y are represented by a division on the graph paper.

If a straight line is drawn on the graph paper it will contain
a series of points which represent values of x and y which are
related together in a simple way. x and y are connected together
in terms of a simple equation, appropriately called a linear
equation. The value of y is dependent on that of x; y is known as the
dependent variable, and x the independent variable. y becomes a
function of x and is sometimes written y = f(x).

Let us first consider a straight line drawn through the origin O
and at an angle θ (theta) with the axis of x (OX).

Consider any point P on the line.

Its co-ordinates, that is its x and y values, are related together by

$$\frac{y}{x} = \tan\theta \quad\text{or}\quad y = x\tan\theta$$

The slope of the line can thus be thought of as the tangent of the
angle which the line makes with the axis of x.* The equation of
this line has already been given: it is y = x tan θ and this connects
all the x and y values on the line.





When the line does not go through the origin but meets the
axis of y at the point I, cutting off a piece of length c on it, it will
readily be seen that the equation of the straight line is y = x tan θ + c,
for every y value corresponding to an x in the previous equation
of the line through the origin will have to be increased by the
intercept c on the axis of y.

* See page 157.

Any equation which can be put in the form lx + my + n = 0,
where l, m and n are independent of x and y, can be represented on
a graph as a straight line.

In this case, the slope of the line

$$= -\frac{l}{m}$$

and the intercept on the axis of y

$$= -\frac{n}{m}$$
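The passage from the general form lx + my + n = 0 to the slope and intercept can be sketched in Python. The function names are ours, and the example line x - y + 2 = 0 is a hypothetical illustration.

```python
import math


def slope_and_intercept(l, m, n):
    """For the line lx + my + n = 0 return (slope, intercept on the y axis)."""
    if m == 0:
        # lx + n = 0 is a vertical line: it has no finite slope.
        raise ValueError("m must not be zero")
    return -l / m, -n / m


def angle_with_x_axis(slope):
    """Angle theta (in degrees) whose tangent is the slope."""
    return math.degrees(math.atan(slope))


# x - y + 2 = 0 is the same line as y = x + 2.
slope, c = slope_and_intercept(1, -1, 2)
theta = angle_with_x_axis(slope)
```

Here the slope is 1, the intercept c is 2 and the line makes an angle of 45° with the axis of x, in agreement with y = x tan θ + c.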

A linear relationship is said to exist between two sets of measures
if a straight-line graph is yielded when points representing
corresponding sets of values are plotted and joined.

The use of straight-line or other graphs as ready reckoners,
conversion tables, etc., needs no stressing.

A few words should be said about regression lines. The line
y = rx gives the regression of y on x and x = ry gives the regression
of x on y. Where r = 1 (perfect correlation) the line y = x goes
through the origin and makes an angle of 45° with both axes.
The older school of statisticians would say that when correlation
was perfect there was no regression, but some writers make r, the
correlation coefficient (and slope of the regression line), a direct
measure of the regression. From the context it is usually easy to
see what a writer intends to convey. Regression gives us a measure
of the reliability of predicting the value of a measure by reference
to that of another with which it is correlated to a greater or
lesser degree.

The calculus is best approached by considering the graphs of
curves. We may look upon differentiation as a process of measuring
rate of change, curvature, etc., and integration as one of
summation, the determination of areas, etc. Differentiation and
integration may be regarded as one the reverse of the other. As
these processes involve conceptions relating to infinity and
infinitesimals care must be taken to see that these ideas are not
given the form of absolute numbers.

Suppose the curved line represents a function f(x) of x. Its
equation is y = f(x). Consider a point P₁ on the line, whose
co-ordinates are x and y. Further, take another point P₂ near to
it with co-ordinates slightly larger, x + δx and y + δy, where δx
and δy (delta x and delta y) are small increments in the value of
x and y respectively.

Now consider the small triangle P₁MP₂ with vertical side δy
and base δx. Its hypotenuse P₁P₂ will approximate to a portion
of the curve as δy and δx become smaller.

P₁P₂ will be a tiny part of a tangent to the curve as P₁ and P₂
approach one another.

The slope of this tangent $= \dfrac{\delta y}{\delta x}$

$$
\begin{aligned}
y &= f(x) \\
\therefore\; y + \delta y &= f(x + \delta x) \\
\delta y &= f(x + \delta x) - f(x) \\
\frac{\delta y}{\delta x} &= \frac{f(x + \delta x) - f(x)}{\delta x}
\end{aligned}
$$

It is necessary to utter a word of warning that the rigorous treatment
of the calculus must be regarded as being beyond the scope
of this short statement. $\dfrac{\delta y}{\delta x}$ is a true quotient obtained by dividing
small but finite quantities δy and δx, but when we proceed to the
limit and obtain the differential coefficient $\dfrac{dy}{dx}$ this must not be
regarded as a fraction but as an operator $\dfrac{d}{dx}$ acting on y. The
differential coefficient of a function is spoken of as its first derivative
and is represented by f′(x).

A simple example will show this method in use.

$$
\begin{aligned}
\text{Suppose}\quad y &= x^2 \\
y + \delta y &= (x + \delta x)^2 \\
y + \delta y &= x^2 + 2x\,\delta x + \delta x^2 \\
\delta y &= x^2 + 2x\,\delta x + \delta x^2 - x^2 \\
&= 2x\,\delta x + \delta x^2 \\
\therefore\; \frac{\delta y}{\delta x} &= 2x + \delta x
\end{aligned}
$$

Now making δy and δx smaller and smaller

$$\frac{dy}{dx} = 2x$$

(This is where this method, though simple, lacks rigour, for we
assume that δx vanishes but that $\dfrac{\delta y}{\delta x}$ becomes $\dfrac{dy}{dx}$. The above
method might be regarded as a useful demonstration rather than
a proof.)
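The behaviour of the quotient as the increment shrinks can be watched numerically. In the sketch below (function names ours) the point x = 3 and the increments are arbitrary choices; for y = x² the quotient should equal 2x + δx at every stage.

```python
def difference_quotient(f, x, dx):
    """The true quotient delta-y / delta-x for a finite increment dx."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


# Halve the increment each time and watch the quotient approach 2x = 6.
increments = (0.5, 0.25, 0.125)
quotients = [difference_quotient(square, 3.0, dx) for dx in increments]
```

The quotients are 6.5, 6.25 and 6.125, that is 2x + δx each time, closing in on the differential coefficient 2x = 6 without δx ever being set equal to zero.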

To find the differential coefficient or the derivative for xⁿ we
need to keep in mind the binomial expansion for (x + a)ⁿ:

$$(x + a)^n = x^n + n x^{n-1} a + \frac{n(n-1)}{1 \times 2}\, x^{n-2} a^2 + \frac{n(n-1)(n-2)}{1 \times 2 \times 3}\, x^{n-3} a^3 + \dots + a^n$$

To find $\dfrac{d\,x^n}{dx}$:

$$
\begin{aligned}
y &= x^n \\
y + \delta y &= (x + \delta x)^n \\
&= x^n + n x^{n-1}\,\delta x + \frac{n(n-1)}{1 \times 2}\, x^{n-2}\,\delta x^2 + \dots \\
\delta y &= n x^{n-1}\,\delta x + \frac{n(n-1)}{1 \times 2}\, x^{n-2}\,\delta x^2 + \dots \\
\frac{\delta y}{\delta x} &= n x^{n-1} + \frac{n(n-1)}{1 \times 2}\, x^{n-2}\,\delta x + \text{terms containing higher powers of } \delta x
\end{aligned}
$$

Proceeding to the limit

$$\frac{dy}{dx} = n x^{n-1}$$

as terms containing δx and its powers vanish in the limit.

As the differential coefficient gives a measure of the slope of the
curve it will be equal to 0 where the curve has no slope, that is to
say at the points of the curve where the tangents are horizontal.

Thus, we find values of x which correspond to maximum or
minimum values of the function by equating the differential
coefficient to zero and solving the equation.

This method will not distinguish between maximum and minimum
values but it can readily be seen that, as we trace out a curve,
a tangent to the moving point will turn in a clockwise direction
as we approach and pass a maximum value and it will turn in an
anticlockwise direction as we approach and pass a minimum.
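The turning of the tangent can be illustrated numerically. The sketch below takes a hypothetical curve, y = x³ - 3x, whose differential coefficient 3x² - 3 is zero at x = -1 and x = 1, and examines the slope just before and just after each of these points. The function names and the step of 0.01 are our own choices.

```python
def slope(x):
    """dy/dx for the illustrative curve y = x**3 - 3*x."""
    return 3 * x * x - 3


def classify(x, step=0.01):
    """Compare the slope a little before and a little after the point x."""
    before, after = slope(x - step), slope(x + step)
    if before > 0 > after:
        # Slope falls from positive to negative: tangent turns clockwise.
        return "maximum"
    if before < 0 < after:
        # Slope rises from negative to positive: tangent turns anticlockwise.
        return "minimum"
    return "neither"


stationary_points = [-1.0, 1.0]   # solutions of 3x**2 - 3 = 0
kinds = {x: classify(x) for x in stationary_points}
```

A slope that is falling as we pass the point marks a maximum, and one that is rising marks a minimum; this rate of change of the slope is what a second differentiation measures.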

Thus, a further process of differentiation (double differentiation)
will give us a clue to the recognition of maxima and minima.

If the second differential coefficient $\dfrac{d^2y}{dx^2}$ has a positive value the
point concerned will be a minimum and if it has a negative value
the point will be a maximum.

Differentiation

$$
\begin{aligned}
\frac{d}{dx}\, x^{n+1} &= (n+1)\,x^n \\
\frac{d}{dx}\, \log_e x &= \frac{1}{x} \\
\frac{d}{dx}\, \cos x &= -\sin x \\
\frac{d}{dx}\, \sin x &= \cos x \\
\frac{d}{dx}\, \tan x &= \sec^2 x \\
\frac{d}{dx}\, \cot x &= -\operatorname{cosec}^2 x
\end{aligned}
$$

$\dfrac{d}{dx}$ should be regarded as an operator and not as a fraction.

Integration

$$
\begin{aligned}
\int x^n\,dx &= \frac{x^{n+1}}{n+1} \\
\int \sin x\,dx &= -\cos x \\
\int \cos x\,dx &= \sin x \\
\int \sec^2 x\,dx &= \tan x \\
\int \operatorname{cosec}^2 x\,dx &= -\cot x
\end{aligned}
$$

A constant which can be determined from the practical nature of
the problem and the given data has to be added in each case. This is
obvious when it is remembered that integration is the reverse of
differentiation and that the differential coefficient of a constant
is zero.

Trigonometrical Functions of an Angle

Consider the right-angled triangle POX with angle POX = θ,
PX (perpendicular) = p, OX (base) = b, OP (hypotenuse) = h.

$$
\begin{aligned}
\text{sine } \theta\ (\sin) &= \frac{p}{h} & \text{cotangent } \theta\ (\cot) &= \frac{b}{p} \\
\text{cosine } \theta\ (\cos) &= \frac{b}{h} & \text{secant } \theta\ (\sec) &= \frac{h}{b} \\
\text{tangent } \theta\ (\tan) &= \frac{p}{b} & \text{cosecant } \theta\ (\operatorname{cosec}) &= \frac{h}{p}
\end{aligned}
$$

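It will readily be seen using the properties of a right-angled
triangle that each of these functions may be calculated by knowing
any one of the others. The following relationships are most
important:

$$
\begin{aligned}
\tan\theta &= \frac{\sin\theta}{\cos\theta}, & \sin^2\theta + \cos^2\theta &= 1, \\
\cot\theta &= \frac{1}{\tan\theta}, \quad \sec\theta = \frac{1}{\cos\theta}, & \operatorname{cosec}\theta &= \frac{1}{\sin\theta}, \\
\cos(90^\circ - \theta) &= \sin\theta, & \sin(90^\circ - \theta) &= \cos\theta.
\end{aligned}
$$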


The angle θ must not be regarded as an angle limited to less than
a right angle. A triangle of reference POX may be drawn by
dropping a perpendicular PX from a point P on the line OP
generating the angle on to the axis of X, X′OX. Although the
tables only give angles between 0° and 90° the trigonometrical
functions for other angles may be calculated by arranging them
as (180° - θ), (180° + θ), (360° - θ) where θ is an angle less than
90° which can be found from the tables. The following diagram
shows when it is necessary to change the sign of the function
found in the tables. Angles are measured in an anticlockwise
direction and the complete round of angles (360°) is divided into
four quadrants

(180° - θ)  sine +, cosec +     |   All +
--------------------------------+--------------------------------
(180° + θ)  tan +, cot +        |   (360° - θ)  cosine +, sec +

or in the mnemonic form by using the word CAST:

     S | A
     --+--
     T | C

It may be useful to remember that:

sin 0° = 0          cos 0° = 1          tan 0° = 0
sin 30° = 1/2       cos 30° = √3/2      tan 30° = 1/√3
sin 45° = 1/√2      cos 45° = 1/√2      tan 45° = 1
sin 60° = √3/2      cos 60° = 1/2       tan 60° = √3
sin 90° = 1         cos 90° = 0         tan 90° = ∞ (infinity)
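The use of the quadrant rule with tables limited to 0° to 90° can be sketched as follows. The names `reduce_angle` and `from_tables` are ours, and Python's `math` functions, applied only to the acute angle, stand in for the printed tables.

```python
import math


def reduce_angle(degrees):
    """Return (acute angle, quadrant) for an angle measured anticlockwise."""
    a = degrees % 360
    if a <= 90:
        return a, 1            # the angle itself
    if a <= 180:
        return 180 - a, 2      # the form (180 - theta)
    if a <= 270:
        return a - 180, 3      # the form (180 + theta)
    return 360 - a, 4          # the form (360 - theta)


# Functions that keep a positive sign in each quadrant (All, S, T, C).
POSITIVE = {1: {"sin", "cos", "tan"}, 2: {"sin"}, 3: {"tan"}, 4: {"cos"}}
TABLES = {"sin": math.sin, "cos": math.cos, "tan": math.tan}


def from_tables(name, degrees):
    """Look up the acute angle, then attach the sign given by the diagram."""
    acute, quadrant = reduce_angle(degrees)
    value = TABLES[name](math.radians(acute))
    return value if name in POSITIVE[quadrant] else -value
```

For example, sin 150° is found as +sin 30° = 1/2, cos 240° as -cos 60° = -1/2, and tan 315° as -tan 45° = -1.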


Sometimes mental tests, mental ‘factors’, etc., are represented
as vectors, that is, straight lines at an angle to one another. The
correlation coefficient between the quantities represented by any
two lines is given by the cosine of the angle between them. The
projection of one line upon another is equal to the length of the
first line multiplied by the cosine of the angle between the lines.
(Do not confuse this with regression and remember that the ‘slope’
of a line is given by the tangent of the angle which it makes with an
axis of reference.)

Factors, etc., represented by vectors at right angles are obviously
uncorrelated (cos 90° = 0) and they are said to be orthogonal.

Factors, etc., represented by vectors which are not at right 
angles contain some measure of correspondence (the cosine of the 
angle between them is not zero). These are said to be oblique 
factors. 
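The relation between a correlation coefficient and the angle between two vectors can be sketched as follows. The function names and the sample correlations are ours, chosen only for illustration.

```python
import math


def angle_between(r):
    """Angle in degrees between two vectors whose correlation is r."""
    return math.degrees(math.acos(r))


def projection(length, r):
    """Projection of a line of the given length upon a second line,
    where r is the cosine of the angle between the two lines."""
    return length * r


orthogonal_angle = angle_between(0.0)   # uncorrelated factors
oblique_angle = angle_between(0.5)      # factors correlating 0.5
```

A correlation of 0 gives an angle of 90° (orthogonal factors), a correlation of 0.5 gives 60° (oblique factors), and a correlation of 1 gives 0°, the two lines coinciding.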

This useful idea can be extended from two dimensions to three
(and analytically, without trying to conceive models, to 4 or more.
The geometry of hyperspace can be used for dealing with more
than 3 factors which are represented by vectors). Three orthogonal
factors can be thought of as lying along the edges of a
rectangular box and meeting at one of its corners. A number of
oblique factors could be drawn as lines in space radiating from a
point. If an arbitrary line were taken to represent the first factor
the other lines could be imagined to fit into their relative positions
by taking the correlation coefficient between each pair, finding
the angle of which it is the cosine and fitting in the line accordingly.
With three lines this involves a simple principle of solid
geometry but with four or more analytical methods using algebra
and trigonometry may have to suffice. Angles are not always
given in degrees, and it is often more convenient to think of them
in radian measure.

2π radians = 360°

π radians = 180°

1 radian = 180°/π

When the symbol π appears in formulae used in psychological
and educational statistics it usually refers to an angle of two right
angles or 180°.

THE USE OF THE SLIDE-RULE*

The slide-rule, which dates from about the same period as
that of the invention of logarithms, is really a simple instrument
working on logarithmic principles. To multiply two
numbers we add their logarithms. If, therefore, we have two
scales whose distances and divisions are measured out in the
lengths of the logarithms which they represent it is easy to see that
numbers may be multiplied by adding these logarithmic lengths
by means of two scales one of which is capable of sliding against
another. Division may be performed by subtracting these logarithmic
lengths, squaring by doubling and finding a square root by
halving and so on. In our work the slide-rule is particularly useful
when each of a set of numbers has to be multiplied (or divided)
by a factor, as for instance in reducing a set of marks from one
maximum to another. One setting of the rule is all that is required
and the reduced marks may be read off directly from the rule.
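The logarithmic principle can be imitated in a few lines. In the sketch below (function names and marks are hypothetical) the ‘setting of the rule’ is a single logarithmic length which is then added to the logarithm of every mark in turn.

```python
import math


def slide_rule_multiply(a, b):
    """Multiply two positive numbers by adding their logarithms."""
    return 10 ** (math.log10(a) + math.log10(b))


def rescale(marks, old_max, new_max):
    """Reduce positive marks from one maximum to another with one 'setting'."""
    # The single setting: the logarithmic length for new_max / old_max.
    setting = math.log10(new_max) - math.log10(old_max)
    # Each mark is then read off by adding the same length to its logarithm.
    return [10 ** (math.log10(m) + setting) for m in marks]


# Hypothetical marks out of 60 reduced to marks out of 100.
reduced = rescale([30, 45, 60], old_max=60, new_max=100)
```

The results agree with 50, 75 and 100 to within the rounding of the logarithms, just as readings from the instrument agree with exact arithmetic to within the accuracy of its graduations. A mark of zero has no logarithm and must be dealt with separately.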

Although most work in educational and psychological statistics
does not call for the full resources of the instrument such as is
used by engineers, it is worth while to acquire a good one, which
will cost from 30S. to The beginner need not feel overwhelmed 

by the amount of metrical material compressed into one scale. If
any difficulty arises it will suffice to make a simple slide-rule by
gumming two strips of logarithmic graph paper to two ruler-like
pieces of wood respectively which can be made to slide against
one another and may be kept together by a couple of small elastic
bands. No difficulty is expected, however.

Finding Numbers

The front face of the ordinary 10-inch slide-rule consists of two
pairs of scales; the upper ones usually are called the A and B scales
and the lower pair are known as the C and D scales.

* See also the section on Logarithms in The Teaching of Arithmetic and
Elementary Mathematics, by the author.
Any number of whatever reasonable magnitude can be located
on the slide-rule, because the first mark can be called 1, 10 or 100
as required. The sub-division of units sometimes gives difficulty
at first but since there are only three different variations to learn
these should be mastered at the outset.

If we call the first mark on the A scale 10, the number 11 is to
be found five graduations (division marks) further along; the
space between 10 and 11 is divided into five parts, with graduation
marks at 10.2, 10.4, 10.6, 10.8, leaving any smaller divisions to be
estimated as required. This method of marking continues until
20 is reached, after which the spaces between the whole numbers
are not large enough to allow five divisions, so from thence
onwards the units are only cut in half. From 50 to 100 there is not
even room for this to be done and the units are no longer
subdivided.

On the D scale there is more room as ‘smaller’ numbers are
involved. If the beginning is called 10, the number 11 is found
ten marks further along, the intermediate values being 10.1, 10.2,
etc., to 10.9 and this system is continued up to 20. From 20 to 40
the units have five divisions each, e.g., 20.2, 20.4, 20.6, 20.8, after
which there is only sufficient room for half divisions to be shown.

If a 10-inch slide-rule is examined carefully so that these facts
are appreciated, facility in finding and reading numbers will soon
follow.

It is always worth while to perform rough mental calculations 
of the answer as this will help to find the correct place for the 
decimal point. 


1. Multiplication

Example: 14.6 × 3.2 (approximate value 50). Put B1 (the
beginning of the B scale) against one of the numbers on the A
scale. Locate the second number on the B scale and read off the
product from the A scale immediately above the B scale number.
The fine vertical line of the transparent window of the sliding
cursor may help in reading a number on one scale which is
exactly in line with a number on the other.
In effect, in this process of multiplication a piece of the A scale
has been added to a piece of the B scale and, as the numbers are
multiplied together by adding their logarithmic lengths, the total
length indicates the product of the two numbers.

2. Division 

Example: 43.6 ÷ 19.8 (estimated approximate value 2).

Place the divisor 19.8 on the B scale immediately under the
dividend 43.6 on the A scale. The quotient may be read off on the
A scale immediately above B1. In division a piece of Scale B is
subtracted from a piece of Scale A. To divide two numbers we
subtract their logarithms.

Both multiplication and division can be performed on the C and
D scales. The results can usually be estimated to a greater degree
of accuracy owing to the larger divisions, but working is generally
a little slower than with the A and B scales.
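
The principle behind both operations can be sketched in a few lines of Python. This is only an illustration of adding and subtracting logarithmic lengths, not a model of any particular instrument; the function names are our own.

```python
import math

def multiply_by_log_lengths(a, b):
    # Lay off the length log10(a) on the fixed scale, then add the
    # length log10(b) from the sliding scale; the total length marks
    # the product.
    total_length = math.log10(a) + math.log10(b)
    return 10 ** total_length

def divide_by_log_lengths(a, b):
    # Subtract the length log10(b) from the length log10(a).
    remaining_length = math.log10(a) - math.log10(b)
    return 10 ** remaining_length

print(round(multiply_by_log_lengths(14.6, 3.2), 2))   # rough estimate: 50
print(round(divide_by_log_lengths(43.6, 19.8), 2))    # rough estimate: 2
```

As on the rule itself, the rough mental estimate is what fixes the position of the decimal point; the code merely carries it along automatically.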


3. Conversion and Reduction 


These processes are equivalent to multiplying or dividing the
given number by a certain factor. It will be seen that division by
a number is equivalent to multiplication by the reciprocal of the
number, e.g. division by 12 is equivalent to multiplication by
1/12 or .0833. Each case must be considered on its merits, that is,
whether it is easier to multiply by a factor or divide by its
reciprocal. Example: To convert marks given with a maximum score of
80 to a maximum of 100. This is equivalent to multiplying each
mark by 100/80 or 1.25. For ease of working it is better to put B1
opposite to 1.25 on the A scale and read off the result on the A
scale immediately above the given number on the B scale. After
the initial setting no further movement of the scale will be required
for the whole set of marks.

The conversion of marks from a maximum of 100 to a maximum of 80
need not be regarded as a division but rather as a multiplication
by the factor .8.
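
A short sketch of the same reduction, with the single factor playing the part of the one setting of the rule. The marks used are invented for illustration and the function name is our own.

```python
def rescale_marks(marks, old_max, new_max):
    # One "setting of the rule": the factor is fixed once and then
    # applied to every mark in the set.
    factor = new_max / old_max
    return [round(mark * factor, 1) for mark in marks]

# Hypothetical marks out of 80, converted to marks out of 100.
print(rescale_marks([40, 56, 72, 80], 80, 100))
```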
Squaring Numbers 

Find the number on the D scale. Its square lies immediately 
above it on the A scale. Use the cursor. The scales all remain at 
‘zero’ position. 

Finding Square Roots 

Find the number on the A scale. Its square root lies immediately
below it on the D scale. Now any number which is given in figures
without a decimal point will appear to have a choice of one of
two square roots (quite apart from negative roots), e.g. the square
root of 4.0 is 2.0 but that of 40 is 6.3. Thus there are two positions
for any number on the A scale, and the correct one must be
chosen with reference to the size of the given number according
to the following rule. For numbers with an odd characteristic use
the right-hand part of the A scale. For numbers with an even
characteristic use the left-hand part. The characteristic is one
less than the number of digits to the left of the decimal point, and
if negative is one more than the number of noughts immediately to
the right of the decimal point, e.g.

    Number       Characteristic    Odd or even
    3167               3              odd
    316.7              2              even
    9.6                0              even
    .3076             -1              odd
    .0003001          -4              even


In using tables of square roots the same principle applies, but it is
usually sufficient to make a rough mental estimate of the required
value and this will determine which of the two given numbers is
required.
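
The rule for the characteristic can be put as a small sketch. We assume here that the characteristic of a positive number is the integer part (rounded downwards) of its common logarithm, which agrees with the counting rule given above; the function names are our own.

```python
import math

def characteristic(number):
    # Integer part, rounded downwards, of the common logarithm.
    return math.floor(math.log10(number))

def a_scale_half(number):
    # Odd characteristic: right-hand part of the A scale.
    # Even characteristic: left-hand part.
    return "right" if characteristic(number) % 2 else "left"

for value in (316.7, 9.6, 0.3076, 0.0003001, 4.0, 40):
    print(value, characteristic(value), a_scale_half(value))
```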
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PASCAL'S TRIANGLE AND THE NORMAL 
CURVE OF DISTRIBUTION 

Suppose that we toss a penny a large number of times. In the
long run heads and tails will be about equally divided and
the distribution will be in the proportion

    H    T
    1    1

If we toss two pennies there will be three possibilities: two heads,
two tails, one head one tail, in the proportion:*

    HH    HT, TH    TT
    1       2       1

With three pennies there will be four possibilities: three heads,
three tails, one head two tails, one tail two heads, in the proportion

    HHH    HHT    HTT    TTT
    1      3      3      1

and so on. Although we do not find these proportions strictly
observed unless we take inconveniently or impossibly large
numbers of cases, these figures represent the probabilities of the
distributions of each particular showing of heads and tails.

This at once suggests to us that it may be useful to consider the
numbers arising when we continue to multiply 11 by itself, that is,
the powers of 11

    (11)^1 = 11
    (11)^2 = 121
    (11)^3 = 1331
    (11)^4 = 14641


* Students of biology will note that these are the proportions of offspring showing
distinct transmissible characteristics in the simplest application of Mendel's laws,
e.g. in the second generation of peas in the crossing of long and short peas, pure
long peas, impure long peas and pure short peas were in the proportions 1 : 2 : 1
respectively.

In building up Pascal’s triangle we must continue the powers of
11 without carrying additions above 10 into a higher column.
Those who are familiar with the binomial theorem will see that the
above continuous multiplication by 11 gives the coefficients in the
binomial expansion $(1 + x)^n$ of the ascending powers of x.

Thus $(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$ by the expansion of
$(11)^4$ or 1 4 6 4 1.

If it can be imagined that we continue Pascal’s triangle to the
limit, making the power n sufficiently large, we should
arrive at the exponential curve known as the probability curve or
the curve of error. Instead of thinking of the smooth curve
which is reached in the limit, let us imagine the histogram
given below.

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1

    Pascal’s triangle
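
The building-up of the triangle by multiplying by 11 without carrying can be sketched as follows. Multiplying by 11 amounts to adding a row to a copy of itself shifted one place, and leaving out the carrying keeps each column as it stands. The function names are our own.

```python
def next_row(row):
    # "Multiply by 11 without carrying": add the row to a copy of
    # itself shifted one place, column by column.
    shifted_left = row + [0]
    shifted_right = [0] + row
    return [a + b for a, b in zip(shifted_left, shifted_right)]

def pascal_rows(power):
    # Rows of the triangle from (11)^0 up to (11)^power.
    rows = [[1]]
    for _ in range(power):
        rows.append(next_row(rows[-1]))
    return rows

for row in pascal_rows(6):
    print(row)

# Chance of four heads in a single throw of four coins.
four_coins = pascal_rows(4)[-1]
print(four_coins[0], "in", sum(four_coins))
```

Each row sums to a power of 2, which is the total number of equally likely showings of heads and tails for that number of coins.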

[Histogram: columns of height 1, 4, 6, 4, 1 standing over HHHH, HHHT, HHTT, HTTT, TTTT]

It will readily be seen that the area of the whole figure
represents the total number of cases, i.e. 16, the height of any
column the frequency for each distribution of heads and tails,
and the distance from the centre point of the horizontal line
(the x distance) the degree of departure from the central
or most common tendency (the mode), in this case, two heads and
two tails. The chance that a single throw of four coins will give a
particular number of heads and tails is given by the area of the
column concerned compared with that of the whole figure, e.g.
the chance of throwing four heads (or four tails) is 1 in 16.

If we now return to consider the histogram ‘smoothed out’ and
its area representing a large number of cases, it is easy to
appreciate that the probability that a measure (x) will lie at a certain
distance from the central point is given by the ratio of the area
of the tail of the curve beyond that point and the area of the
remainder of the curve cut off by an ordinate through the point.
In some cases (and these should be obvious) it will only be
necessary to consider one half of the curve, that is, one or other of
the halves on either side of the central line.

Some Properties of the Normal Curve of Distribution 

This curve is also spoken of as the curve of error, the Gaussian
curve or the curve of probability for reasons which we have
already mentioned.

The curve is a member of the family of exponential curves, that
is, it is related to the growth function e. The exponential function
$e^x$ has a rate of growth equal to itself, i.e. $\frac{d(e^x)}{dx} = e^x$.
The curve may be defined as a frequency curve whose height
at any point is inversely proportional to the antilogarithm of half
the square of the distance, measured in terms of the standard
deviation as the unit, of that point from the mean.

The formula for the curve is $y = y_0 e^{-x^2/2\sigma^2}$ where x and y are
points on the curve with respect to O, the central point on the x
axis, and $y_0$ is the ‘height’ of the curve at its central point, that is,
the distance which it cuts off along the y axis.

$\sigma$ is the standard deviation.

If this is large the curve is flat at the top and is said to be platykurtic.
If this is small the curve is sharp and pointed and is said to be
leptokurtic.

The degree of curvature is spoken of as kurtosis.

For our purpose we must regard y as a frequency of a score x
which is referred to the average as zero.

We will differentiate the function representing the curve of normal
distribution, written as

$$ y = \frac{N}{\sigma\sqrt{2\pi}} e^{-x^2/2\sigma^2} $$

where N is the number of cases in the distribution and $\sigma$ is the
standard deviation.

Let us write $\frac{N}{\sigma\sqrt{2\pi}} = c$, a constant.

$$ y = c\, e^{-x^2/2\sigma^2} $$

$$ \frac{dy}{dx} = c\, e^{-x^2/2\sigma^2} \left(-\frac{2x}{2\sigma^2}\right) = -\frac{cx}{\sigma^2} e^{-x^2/2\sigma^2} = \frac{-cx}{\sigma^2 e^{x^2/2\sigma^2}} $$

If we substitute x = 0 in this derived function (first differential
coefficient) it vanishes.

Thus, this represents a value where the curve is at a maximum or
minimum value. It is easy to see that this is actually a maximum.

Let us try to find other points where the curve has a maximum
or minimum value, i.e. where it is horizontal.

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Equate the first derived function to zero. 
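$$ \frac{-cx}{\sigma^2 e^{x^2/2\sigma^2}} = 0 $$

Divide by $\frac{-cx}{\sigma^2}$:

$$ \frac{1}{e^{x^2/2\sigma^2}} = 0, \qquad e^{x^2/2\sigma^2} = \infty $$

Taking logs:

$$ \log_e \left(e^{x^2/2\sigma^2}\right) = \infty $$

Now $\log_e e = 1$

$$ \therefore\ \frac{x^2}{2\sigma^2} = \infty, \qquad x = \pm\infty $$
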

Thus, the curve is horizontal at infinite distances from the 
central line. (It is necessary to give a word of warning about the 
above demonstration. We have used ‘infinity’ as though it were a 
number and this may lead to absurdities. The above is not a 
rigorous demonstration and it is wise to warn the student against 
using ‘plus and minus infinity’. Here we have unfortunately 
had to sacrifice rigour for the sake of a simple demonstration.) 

Students who have proceeded a little further with the calculus 
than we have done here will be able to continue and find the 
second derivative or differential coefficient of the function of the 
curve of normal distribution. 


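$$ \frac{dy}{dx} = -\frac{cx}{\sigma^2} e^{-x^2/2\sigma^2} $$

$$ \frac{d^2y}{dx^2} = \frac{c}{\sigma^4}(x^2 - \sigma^2)\, e^{-x^2/2\sigma^2} = \frac{N(x^2 - \sigma^2)}{\sigma^5\sqrt{2\pi}}\, e^{-x^2/2\sigma^2} $$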
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It will be observed that at symmetrical points of the curve there 
are points of inflexion, that is, the convex curvature of the top 
part of the curve gives way to the concave lower portions on each 
side. The rate of curvature will obviously be zero at these points. 
We can find them by equating the second derivative of the function
to zero:

$$ \frac{c(x^2 - \sigma^2)}{\sigma^4}\, e^{-x^2/2\sigma^2} = 0 $$

Dividing through by $\frac{c}{\sigma^4} e^{-x^2/2\sigma^2}$:

$$ (x^2 - \sigma^2) = 0, \qquad x^2 = \sigma^2, \qquad x = \pm\sigma $$

Thus the points of inflexion are at a distance $\sigma$ from the central point.
Let us consider the curve drawn on such a scale that its area is
unity. The total number of cases N given by the area of the curve
will be represented by unit area.

At the centre point or origin where x = 0 the equation of the
curve becomes

$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{0} = \frac{1}{\sigma\sqrt{2\pi}} $$

Thus $\frac{1}{\sigma\sqrt{2\pi}}$ is the height of the curve at its maximum (its
modal ordinate) or the intercept cut off by the curve on the axis
of y.

The area of the curve $y = \frac{1}{\sigma\sqrt{2\pi}} e^{-x^2/2\sigma^2}$ can be found by
integration. The curve must be thought of as extending from an infinite
distance to the left of the centre point to an infinite distance to
the right.

The total area is given by

$$ \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx $$

which is equal to 1.

[If this exponential curve could be considered as a development
from the expansion of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$ the sum of all the
ordinates is 1 for $(\tfrac{1}{2} + \tfrac{1}{2})^n = 1^n = 1$.]
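
The three results just reached (the modal ordinate, the points of inflexion at a distance of one standard deviation, and the unit area) can be checked numerically. The following is a minimal sketch under our own assumptions: the function names are invented, the area is approximated by the trapezoidal rule, and the 'infinite' limits are replaced by ten standard deviations on either side.

```python
import math

def normal_ordinate(x, sigma=1.0, n_cases=1.0):
    # y = N / (sigma * sqrt(2*pi)) * exp(-x^2 / (2*sigma^2))
    c = n_cases / (sigma * math.sqrt(2 * math.pi))
    return c * math.exp(-x * x / (2 * sigma * sigma))

def second_derivative(x, sigma=1.0, n_cases=1.0):
    # d2y/dx2 = (x^2 - sigma^2) / sigma^4 * y
    return (x * x - sigma * sigma) / sigma ** 4 * normal_ordinate(x, sigma, n_cases)

def area(sigma=1.0, half_width=10.0, steps=20000):
    # Trapezoidal rule from -half_width*sigma to +half_width*sigma.
    start = -half_width * sigma
    step = 2 * half_width * sigma / steps
    total = 0.5 * (normal_ordinate(start, sigma) + normal_ordinate(-start, sigma))
    for i in range(1, steps):
        total += normal_ordinate(start + i * step, sigma)
    return total * step

sigma = 2.0
print(normal_ordinate(0.0, sigma))       # height at the centre
print(second_derivative(sigma, sigma))   # vanishes at x = sigma
print(area(sigma))                       # very close to 1
```

The second derivative is negative inside the points of inflexion and positive outside them, which is the change from convex to concave described above.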


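The expression representing the normal curve may be written

$$ y = \frac{1}{\sigma}\, z \qquad \text{where } z = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2\sigma^2} $$

From statistical tables we may find values of z* for various
values of $\frac{x}{\sigma}$.

If the curve has unit area and unit standard deviation, y = z

$$ \text{and } y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} $$

If N is the area of the curve the equation of the curve of normal
distribution is

$$ y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2} $$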

It is often necessary to find the area of a curve which lies
between the central line and a vertical line at a distance from the
origin, or the area of the ‘tail’ of the curve beyond a given value
of x. Tables are provided of the values of such areas in Chapter V.
These are usually denoted by q. It will be seen that the sum of
these two areas is equal to the total area of the curve on one or
other side of the central line. The value of these areas may be
found from statistical tables or in any particular case by
integrating the formula for the curve between limits, e.g. the area of
the tail of the curve beyond a point $x_1$ on one side of the curve is
given by

$$ \int_{x_1}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx $$
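
For a curve of unit area these two areas can also be had without tables. The sketch below relies on the standard identity that the tail area equals half the complementary error function of $x_1/(\sigma\sqrt{2})$, which is not derived in this book; the function names are our own.

```python
import math

def tail_area(x1, sigma=1.0):
    # Area of the tail beyond x1 for a curve of unit area.
    return 0.5 * math.erfc(x1 / (sigma * math.sqrt(2)))

def central_area(x1, sigma=1.0):
    # Area between the central line and the ordinate at x1.
    return 0.5 - tail_area(x1, sigma)

for x1 in (0.0, 1.0, 1.96):
    print(x1, round(central_area(x1), 4), round(tail_area(x1), 4))
```

The two areas always sum to one half, the whole area on one side of the central line.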

* This value of z is not to be confused with Fisher's z which is the hyperbolic
arc tangent of r, the correlation coefficient, i.e. $z = \tanh^{-1} r$. This transformation
gives more reliable results than r under certain conditions.
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THE SPEARMAN RANKS FORMULA FOR 
CORRELATION 


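$$ \rho = 1 - \frac{6\Sigma d^2}{N(N^2 - 1)} $$

If $\sigma^2$ is the square of the standard deviation of a set of n ranks

$$ \sigma^2 = \frac{1^2 + 2^2 + 3^2 + \dots + n^2}{n} - \left(\frac{1 + 2 + 3 + 4 + \dots + n}{n}\right)^2 $$

$$ \text{i.e. } \sigma^2 = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2 $$

By adding the identities

$$ (n+1)^3 - n^3 = 3n^2 + 3n + 1 $$
$$ n^3 - (n-1)^3 = 3(n-1)^2 + 3(n-1) + 1 $$
$$ (n-1)^3 - (n-2)^3 = 3(n-2)^2 + 3(n-2) + 1 $$
$$ \dots $$
$$ 2^3 - 1^3 = 3 \cdot 1^2 + 3 \cdot 1 + 1 $$

we obtain

$$ (n+1)^3 - 1^3 = 3\Sigma n^2 + 3\Sigma n + n $$

$\Sigma n$ is the sum of the first n natural numbers, i.e. half the sum
of the first and last number multiplied by the number of terms,

$$ \Sigma n = \frac{n}{2}(n + 1) $$

Substituting in the identity

$$ n^3 + 3n^2 + 3n = 3\Sigma n^2 + \frac{3n}{2}(n + 1) + n $$

$$ 3\Sigma n^2 = n^3 + 3n^2 + 3n - \frac{3n}{2}(n + 1) - n $$

$$ 6\Sigma n^2 = 2n^3 + 6n^2 + 6n - 3n(n + 1) - 2n $$

$$ 6\Sigma n^2 = 2n^3 + 3n^2 + n = n(2n + 1)(n + 1) $$

$$ \Sigma n^2 = \frac{n(2n + 1)(n + 1)}{6} $$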
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Substituting in our variance formula 
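$$ \sigma^2 = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2 $$

we have

$$ \sigma^2 = \frac{n(2n + 1)(n + 1)}{6n} - \frac{n^2(n + 1)^2}{4n^2} = \frac{n^2 - 1}{12} $$

Now if $\rho$ is the correlation coefficient between pairs of scores,
assuming that the variabilities and the means of the two sets of
ranks are equal,

$$ \rho = 1 - \frac{\Sigma d^2}{2n\sigma^2} $$

which gives

$$ \rho = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)} $$

by substituting for $\sigma^2$.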
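
Both results can be checked by direct calculation. The sketch below computes the variance of a set of n ranks from first principles for comparison with $(n^2 - 1)/12$, and applies the ranks formula to a small invented example; the function names and the ranks are our own.

```python
def rank_variance(n):
    # Mean of the squared ranks minus the square of the mean rank.
    ranks = range(1, n + 1)
    mean_square = sum(r * r for r in ranks) / n
    mean = sum(ranks) / n
    return mean_square - mean ** 2

def spearman_rho(ranks_p, ranks_r):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    n = len(ranks_p)
    sum_d2 = sum((p - r) ** 2 for p, r in zip(ranks_p, ranks_r))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

for n in (7, 10):
    print(n, rank_variance(n), (n * n - 1) / 12)

# Hypothetical ranks of five pupils in two subjects.
print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```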

This can also be demonstrated in a simpler way:

It would appear from the following identities:

$$ 1^2 + 3^2 = \tfrac{1}{6} \times 4 \times (4^2 - 1) $$
$$ 2^2 + 4^2 = \tfrac{1}{6} \times 5 \times (5^2 - 1) $$
$$ 1^2 + 3^2 + 5^2 = \tfrac{1}{6} \times 6 \times (6^2 - 1) $$
$$ 2^2 + 4^2 + 6^2 = \tfrac{1}{6} \times 7 \times (7^2 - 1) $$

that the sum of the squares of consecutive odd numbers, or of
consecutive even numbers beginning with 2, as far as N − 1 is $\tfrac{1}{6}N(N^2 - 1)$.

Now consider the following cases of perfect negative rank
correlation (i.e. $\rho = -1$):

    Case          Order of Merit    Order of Merit    Difference in
    (N is odd)    (rank)            (rank)            rank squared
                  in subject P      in subject R      d^2

    A                  1                 7                6^2
    B                  2                 6                4^2
    C                  3                 5                2^2
    D                  4                 4                0^2
    E                  5                 3                2^2
    F                  6                 2                4^2
    G                  7                 1                6^2

    Case          Order of Merit    Order of Merit    Difference in
    (N is even)   (rank)            (rank)            rank squared
                  in subject P      in subject R      d^2

    A                  1                 8                7^2
    B                  2                 7                5^2
    C                  3                 6                3^2
    D                  4                 5                1^2
    E                  5                 4                1^2
    F                  6                 3                3^2
    G                  7                 2                5^2
    H                  8                 1                7^2

It will be seen that in both cases where there is perfect negative
correlation $\Sigma d^2 = \tfrac{1}{3}N(N^2 - 1)$. Obviously when the ranks are
identical and there is perfect positive correlation $\Sigma d^2 = 0$;
therefore it is reasonable to suppose that when there is zero correlation
(i.e. halfway between −1 and +1) $\Sigma d^2$ is halfway between
$\tfrac{1}{3}N(N^2 - 1)$ and 0, i.e. $\tfrac{1}{6}N(N^2 - 1)$.

Now if $\Sigma d^2$ were determined by chance alone (no correlation)
it would have the value $\tfrac{1}{6}N(N^2 - 1)$.

Thus $\frac{\Sigma d^2}{\tfrac{1}{6}N(N^2 - 1)}$ gives a measure of the lack of association
between the ranks or the variance of the set of ranks.

The correlation coefficient

$$ \rho = 1 - \frac{\Sigma d^2}{\tfrac{1}{6}N(N^2 - 1)} $$

which can be written

$$ \rho = 1 - \frac{6\Sigma d^2}{N(N^2 - 1)} $$


APPENDIX V 


A NOTE ON CORRELATION AND 
REGRESSION LINES 


Consider N numbers $A_1, A_2, A_3 \dots$ Denote their mean by $\bar{A}$
and the differences or deviations of the numbers from their
mean by $a_1, a_2 \dots$ etc., so that the mean of these is 0.

The standard deviation is

$$ \sigma_a = \sqrt{\frac{\Sigma a^2}{N}} $$

If $\sigma_a = 1$ the numbers $a_1, a_2, a_3$ are said to be in standard
measure. (Alternatively, we could have achieved the same result
by dividing the deviations from the mean by the standard
deviation.)

Consider a second set of N numbers $B_1, B_2, \dots$ and in the
same way derive from them $\bar{B}$, $b_1, b_2 \dots$ and $\sigma_b$ as the standard
deviation of this set.

The coefficient of correlation $r_{ab} = \frac{\Sigma ab}{N\sigma_a\sigma_b}$ by definition.

Consider the identity

$$ \tfrac{1}{2}\Sigma\Sigma(a_p b_q - a_q b_p)^2 = (\Sigma a^2)(\Sigma b^2) - (\Sigma ab)^2 = N^2\sigma_a^2\sigma_b^2(1 - r_{ab}^2) $$

It follows from this that if $r_{ab} = \pm 1$, $\frac{a_p}{b_p} = \frac{a_q}{b_q}$ for all values of
p and q, giving a straight line relationship between each A and the
corresponding value of B. (Note that as the left-hand side of the
identity, being a sum of squares, cannot be negative, $r_{ab}$ cannot lie outside
the limits −1 and +1.)

Normally no such exact linear relation exists but we may find
the line of best fit by finding one which will make the sum of the
squares of the distances of points from it a minimum.

Choose $\lambda$ and $\mu$ so that $\Sigma(b - \lambda a - \mu)^2$ is a minimum.
Differentiating partially with respect to $\lambda$ and $\mu$, we obtain

$$ -2\Sigma a(b - \lambda a - \mu) = 0 = -2\Sigma(b - \lambda a - \mu) $$

which as $\Sigma a = 0$ and similarly $\Sigma b = 0$
reduce to

$$ -2(N\sigma_a\sigma_b r_{ab} - \lambda N\sigma_a^2) = 0 = -2(-N\mu) $$

and thus $\lambda = \frac{\sigma_b r_{ab}}{\sigma_a}$ and $\mu = 0$.

The line of regression of B on A is given by

$$ b = \left(\frac{\sigma_b r_{ab}}{\sigma_a}\right) a $$

and the line of regression of A on B is given by

$$ a = \left(\frac{\sigma_a r_{ab}}{\sigma_b}\right) b $$

If the As and Bs are quite independent $r_{ab}$ will approximate to
zero if N is large enough. The converse is only true for a linear
relationship. In the case of the parabolic curve $b^2 = a$, $r_{ab}$ would
equal 0 and we should use the correlation ratio instead of the
coefficient. Thus, independence involves zero correlation, if N is
large enough, but zero correlation does not necessarily imply
independence.*
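
The regression coefficient and the warning about the parabolic case can both be illustrated with a short calculation. This is a sketch only: the two small sets of numbers are invented, the function names are our own, and the parabola is sampled at points placed symmetrically about zero.

```python
import math

def deviations(values):
    # Differences of the numbers from their mean.
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def correlation_and_slope(set_a, set_b):
    # Returns r_ab and the slope (sigma_b * r_ab / sigma_a) of the
    # line of regression of B on A, in deviation form.
    a, b = deviations(set_a), deviations(set_b)
    n = len(a)
    sigma_a = math.sqrt(sum(x * x for x in a) / n)
    sigma_b = math.sqrt(sum(y * y for y in b) / n)
    r_ab = sum(x * y for x, y in zip(a, b)) / (n * sigma_a * sigma_b)
    return r_ab, r_ab * sigma_b / sigma_a

# An exact straight-line relation: B = 2A + 1.
print(correlation_and_slope([1, 2, 3, 4], [3, 5, 7, 9]))

# A parabolic relation: every A is the square of the corresponding B,
# yet the coefficient of correlation is zero.
b_values = [-2, -1, 0, 1, 2]
a_values = [b * b for b in b_values]
print(correlation_and_slope(a_values, b_values))
```

In the second case B is completely determined by A up to sign, so the two sets are far from independent although the coefficient vanishes.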

* Adapted from ‘Mathematics and Psychology’, Piaggio, Mathematical Gazette,
February 1933. This paper also contains ‘An analysis of the factor g, if it exists’.
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AN EASY PROOF THAT THE COEFFICIENT OF 
CORRELATION IS LESS THAN UNITY 


Suppose that x and y are the standardized scores of a person in
two subjects respectively and that there are N persons taking the
test.

From the process of standardization it follows that $\Sigma x^2$ and
$\Sigma y^2$ both equal N.

The algebraic identity $(x - y)^2 = x^2 - 2xy + y^2$

may be written

$$ 2xy = x^2 + y^2 - (x - y)^2 $$

$$ 2\Sigma xy = \Sigma x^2 + \Sigma y^2 - \Sigma(x - y)^2 $$

$$ 2\Sigma xy = N + N - \Sigma(x - y)^2 $$

$$ \Sigma xy = N - \tfrac{1}{2}\Sigma(x - y)^2 $$

$$ \text{Correlation coefficient } r = \frac{\Sigma xy}{N} = 1 - \frac{\Sigma(x - y)^2}{2N} $$

But the square $(x - y)^2$ can never be negative.

$\frac{\Sigma xy}{N}$ must always be less than 1 unless x = y, in which case the
term $\frac{\Sigma(x - y)^2}{2N}$ vanishes and $\Sigma xy = N$.
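
The identity can be confirmed on any pair of score lists. The following sketch standardizes two invented sets of marks (dividing the deviations by the standard deviation, with N as the divisor) and computes the coefficient in both forms; the function names and the marks are our own.

```python
import math

def standardize(scores):
    # Deviations from the mean divided by the standard deviation,
    # so that the sum of the squares equals N.
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return [(s - mean) / sd for s in scores]

def r_two_ways(first, second):
    x, y = standardize(first), standardize(second)
    n = len(x)
    direct = sum(xi * yi for xi, yi in zip(x, y)) / n
    via_differences = 1 - sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * n)
    return direct, via_differences

# Hypothetical marks of five persons in two subjects.
print(r_two_ways([52, 60, 45, 71, 66], [48, 65, 50, 70, 58]))
```

Since the subtracted term is a sum of squares divided by a positive number, the second form shows at a glance that the coefficient cannot exceed 1.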
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